AD-A069 026   OFFICE OF NAVAL RESEARCH, ARLINGTON, VA

NAVAL RESEARCH LOGISTICS QUARTERLY, VOLUME 26, NUMBER 1

MARCH 1979




UNCLASSIFIED 




CONVEX/STOCHASTIC  PROGRAMMING  AND 
MULTILOCATION  INVENTORY  PROBLEMS 


Uday  S.  Karmarkar 

University  of  Chicago 
Chicago, Illinois

ABSTRACT 

This  paper  examines  a convex  programming  problem  that  arises  in  several 
contexts.  In  particular,  the  formulation  was  motivated  by  a generalization  of 
the  stochastic  multilocation  problem  of  inventory  theory.  The  formulation  also 
subsumes some "active" models of stochastic programming. A qualitative
analysis of the problem is presented and it is shown that optimal policies have a
certain geometric form. Properties of the optimal policy and of the optimal
value function are described.


The problem studied in this paper is a special case of the minimization of a convex
function under linear constraints. The formulation is motivated by the multilocation
distribution problem of inventory theory. Much of the literature in the multilocation area
assumes a special structure in the distribution system; typically a multiechelon structure is
assumed and restrictions are placed on the available activities. The model studied here is
quite general and will allow shipment between all locations, disposal activities, capacity
constraints, and multiple products. In effect, no special assumptions need be made about the
nature of the constraint-coefficient matrix.

The paper on stock redistribution by Allen [2] is perhaps the earliest examination of a
general multilocation problem. A more general version of the distribution problem was posed
by Gross [10], and a complete specification of the optimal policy was obtained for a
two-location example. Krishnan and Rao [16] have tackled a one-period problem similar to that
of Gross except that transshipment decisions are deferred till after demand is realized.
Elmaghraby has studied a stochastic transportation problem [8] in which stocks of a product at
source locations are to be allocated to a separate set of demand locations. The same problem
was studied by Williams [26], who later extended his study to a more general stochastic
programming formulation. The stochastic transportation problem has also been studied by
Balachandran [3]. A problem similar to that of Gross has been studied by Das [7] with the
ordering and transshipment decisions made separately; he derives the optimal policy for the
two-location case. Finally, Karmarkar and Patel [14, 15] and Karmarkar [11, 12] have studied
the general problem posed by Gross using the methods described in this paper. The results
obtained by Gross are recovered and several other examples of two-location problems are
solved. They also extend the results to larger problems and demonstrate a computational
(column generation) method suggested by the approach on a 5-location problem first posed by
Skeith [22].




The problem studied in this paper is more general than the models mentioned above.
Other published multilocation inventory models are more specialized than those above and will
not be discussed here. They are described in extensive surveys of the literature in
multilocation inventory theory by Clark [4], Aggarwal [1], and Karmarkar [11].

In another sense, the problem can be thought of as a generalization of the stochastic
programming model studied by Williams [27, 28]. Qualitatively, the formulation also subsumes
the two-stage stochastic programming models of Dantzig [6], Wets [24, 25], Walkup and Wets
[23], and others, although the special features of two-stage problems are not explored here.
The next section presents the problem to be studied, and gives examples to show its relation to
stochastic programming. The rest of this paper consists of the qualitative study of this
convex/stochastic programming problem using the tools of convex analysis. The notation,
terminology, and general approach followed are those of Rockafellar [19] and are briefly
introduced in the appendix.


CONVEX/STOCHASTIC  PROGRAMS 


Consider the convex programming problem

(CP)    $v(x) = \inf_{z,\,y} \{ cz + L(y) \}$

        subject to (s.t.)  $Az = y - x$,
                           $z \ge 0$.

Here, $x$ and $y$ are $n$-vectors, $c$ and $z$ are $m$-vectors, and $A$ is an $(n \times m)$
real matrix. The function $L$ from $R^n$ to the extended reals is a closed, proper convex
function defined over all of $R^n$, although dom $L$ may be some proper subset of $R^n$.


This formulation can be interpreted in terms of the one-period distribution problem in a
natural way. Let $x$ be a vector of initial stocks held at $n$ different locations. A stocking
decision is to be made which consists of specifying a "target" stock vector $y$ and the set of
activities $z$ by which this level is to be reached. The cost of achieving the target level is
given by $cz$. On observation of the demand in the period, a cost of $L(y)$ is incurred. An
example of this type of problem is given in Karmarkar and Patel [15]. Suppose that in addition
there are constraints on the activities $z$ given by $A_2 z \le b$ and that $y$ is to be
restricted to some set $Y$. Then the problem can be converted to the given form by adding
slacks $s$ to the new constraints and by defining $z' = (z, s)$,
$A' = \begin{bmatrix} A & 0 \\ A_2 & I \end{bmatrix}$, $y' = (y, p)$, $x' = (x, 0)$, and
$L'(y') = L(y) + \delta(y \mid Y) + \delta(p \mid p = b)$, where $\delta(\cdot \mid \cdot)$
denotes an indicator function. Other examples of this type of problem are:


EXAMPLE 1: Let $L(y) = \sum_i l_i(y_i, \gamma_i)$, where
$l_i(y_i, \gamma_i) = \alpha_i (y_i - \gamma_i)^+ + \beta_i (y_i - \gamma_i)^-$.
This formulation is called goal programming.

EXAMPLE 2: Let $L(y) = \sum_i E\, l_i(y_i, \xi_i)$, where $l_i(\cdot, \cdot)$ is as defined
above and $\xi$ is a random vector. This is termed stochastic programming under simple linear
recourse and is the problem studied by Williams in [26].




EXAMPLE 3: Let $L(y) = \sum_i E\, g_i(y_i, \xi_i)$, where $\xi$ is a random vector and each
$g_i(\cdot, \xi_i)$ is a convex function of its first argument for any fixed $\xi_i$. This case
will be called stochastic programming with simple convex recourse.

EXAMPLE 4: Let $L(y) = C(y - d)$, where $d$ is a given $n$-vector and $C(\cdot)$ is defined by
$C(x) = \min c'z'$, subject to $A'z' = x$; $z' \ge 0$. This is an example of a multistage or
dynamic linear program.

EXAMPLE 5: Let $L(y) = E\, C(y - \xi)$, where $\xi$ is a random vector and $C(\cdot)$ is as
defined above. This is called the complete recourse stochastic programming problem.

EXAMPLE 6: Let $L(y) = L'(y) + \alpha E\, v(y - \xi)$, where $L'(y)$ is a closed, proper convex
function, $0 < \alpha < 1$ is a discount factor, and $v(\cdot)$ is as defined in (CP) above.
This problem will be called a multi-stage convex/stochastic program and is discussed by
Karmarkar [11, 13] in greater detail.
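
The loss functions of Examples 1 and 2 can be sketched numerically as follows. This is only an illustration: the function names, the use of a finite list of equally indexed scenarios for the random vector, and the reading of $(\cdot)^-$ as the negative part $\max(-t, 0)$ are assumptions made here, not part of the formulation above.

```python
def piecewise_loss(y_i, target_i, alpha_i, beta_i):
    """l_i = alpha_i * (y_i - target_i)^+ + beta_i * (y_i - target_i)^-."""
    diff = y_i - target_i
    # positive part is charged at alpha_i, negative part at beta_i
    return alpha_i * max(diff, 0.0) + beta_i * max(-diff, 0.0)


def goal_programming_loss(y, targets, alphas, betas):
    """Example 1: separable loss summed over the components of y."""
    return sum(
        piecewise_loss(y_i, g_i, a_i, b_i)
        for y_i, g_i, a_i, b_i in zip(y, targets, alphas, betas)
    )


def simple_recourse_loss(y, scenarios, probabilities, alphas, betas):
    """Example 2: expected loss when the second argument is random.

    `scenarios` is a list of realizations of the random vector and
    `probabilities` the matching (assumed discrete) probabilities.
    """
    return sum(
        p * goal_programming_loss(y, xi, alphas, betas)
        for p, xi in zip(probabilities, scenarios)
    )
```

With `alphas` read as holding costs and `betas` as shortage costs, `simple_recourse_loss` is a discrete-demand version of the familiar newsvendor-type expected cost at each location.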


The following sections contain a qualitative analysis of the problem (CP) aimed at
investigating the structure of the problem, characteristics of the solution set and the nature
of the function $v(x)$.


DUALITY AND OPTIMALITY CONDITIONS


It can be seen immediately that the problem (CP) can be rewritten in terms of the vector
$y$ as an "unconstrained" minimization problem

(UCP)    $v(x) = \inf_y \{ C(y - x) + L(y) \}$

where $C(y - x)$ is implicitly defined by the linear programming subproblem

LP($y - x$)    $C(y - x) = \inf_z \{ cz : Az = y - x,\ z \ge 0 \}$

or equivalently by the dual

LD($y - x$)    $C(y - x) = \sup_\pi \{ \pi(y - x) : \pi A \le c \}$.

The representation in (UCP) follows since the choice of vector $y$ in (CP) leaves $z$ to be
chosen exactly as in LP($y - x$). This step is termed a projection and does not assume any
special structure in the problem.


We also define the related convex subproblem (CS) as

CS($\pi$)    $-L^*(-\pi) = \inf_y \{ \pi y + L(y) \}$

and denote a vector optimal in this problem by $y^\circ(\pi)$. We note that $y^\circ$ solves
the problem iff $\pi y + L(y)$ is subdifferentiable at $y^\circ$ and 0 is in the
subdifferential of $\pi y + L(y)$ at $y^\circ$. $L^*$ is by definition conjugate to $L$ and is
therefore convex.




The problem as posed, without any assumptions, may not possess a solution: it may be
unbounded or, if bounded, the infimum may not be attained. To examine the structure of the
problem in greater detail, consider the Lagrangian:

$f(\pi, x) = \inf_{y,\, z \ge 0} \{ cz + L(y) - \pi(Az - y + x) \}$

$\qquad\quad = \inf_{z \ge 0} \{ (c - \pi A) z \} + \inf_y \{ L(y) + \pi(y - x) \}.$

The first term on the right-hand side is zero for $\pi A \le c$ and is $-\infty$ elsewhere.
The second term can be written as $-L^*(-\pi) - \pi x$. Therefore

$f(\pi, x) = -L^*(-\pi) - \pi x$ if $\pi A \le c$;
$f(\pi, x) = -\infty$ otherwise.


The dual problem (CD), which involves maximizing a concave function over a convex set, can
be stated as

(CD)    $d(x) = \sup_\pi \{ -\pi x - L^*(-\pi) \}$

        s.t. $\pi A \le c$.

This dual problem can also be formulated via Fenchel duality, which has been shown to be
equivalent to the Lagrangian approach (Magnanti [17]). To see this, first let $C^*(\pi)$ be the
indicator function of the set of $\pi$ satisfying $\pi A \le c$, i.e.,

$C^*(\pi) = 0$ if $\pi A \le c$;
$C^*(\pi) = +\infty$ otherwise.


PROPOSITION 1: $C^*$ is the conjugate function of $C$.

PROOF: We have

$C(y) = \sup_\pi\ \pi y$
        s.t. $\pi A \le c$

if the dual feasible region is assumed to be nonempty. This can also be written as
$C(y) = \sup_\pi \{ \pi y - C^*(\pi) \}$. Hence $C$ is the conjugate of $C^*$ and, since $C$
and $C^*$ are closed, $C^*$ is the conjugate of $C$. □

COROLLARY 1: $C(y - x)$ is conjugate to $C^*(\pi) + \pi x$ for fixed $x$.

PROOF: $C(y - x) = \sup_\pi \{ \pi y - C^*(\pi) - \pi x \}$. □

We can now write (UCP) as

$v(x) = \inf_y \{ C(y - x) + L(y) \}$

and the Fenchel dual to this problem is (Rockafellar [19])

(UCD)    $d(x) = \sup_\pi \{ -L^*(-\pi) - C^*(\pi) - \pi x \}$.




The argument of $L^*$ is $-\pi$ since $-L^*(-\pi)$ is the concave conjugate of the concave
function $-L(y)$, which is defined exactly as in CS($\pi$). Applying Fenchel's Duality Theorem
(Rockafellar [19]) to (CD), we have

PROPOSITION 2: For the dual programs (CP) and (CD), $v(x) = d(x)$ if either

(a) ri(dom $L$) $\cap$ dom $C \ne \emptyset$, or

(b) ri(dom $L^*$) $\cap$ ($-$dom $C^*$) $\ne \emptyset$.

If (a) holds, then there is some $\pi^\circ$ such that $d(x) = f(\pi^\circ, x)$, and if (b)
holds then there is some $y^\circ$ such that $v(x) = C(y^\circ - x) + L(y^\circ)$. If both
conditions hold, then the infimum and supremum are both attained and are necessarily finite
and equal.

PROOF: The proof follows from direct application of Theorem 31.1 (page 327) in
Rockafellar [19]. □
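
As a small numerical illustration of Proposition 2, the sketch below compares the primal and dual values for a one-dimensional instance with a piecewise-linear (goal-programming) loss. Everything specific here is an assumption made for illustration: the scalar setting, the parameter names, the two activities (ordering at unit cost `c_plus`, disposal at unit cost `c_minus`), and nonnegative cost parameters. For this loss, $L^*(\sigma) = \sigma\gamma$ on $-\beta \le \sigma \le \alpha$, so the dual reduces to maximizing $\pi(\gamma - x)$ over an interval.

```python
import math


def transition_cost(t, c_plus, c_minus):
    """C(t): cheapest way to change the stock by t (order if t >= 0, dispose otherwise)."""
    return c_plus * t if t >= 0 else -c_minus * t


def loss(y, gamma, alpha, beta):
    """L(y) = alpha*(y - gamma)^+ + beta*(y - gamma)^-."""
    d = y - gamma
    return alpha * d if d >= 0 else -beta * d


def primal_value(x, gamma, alpha, beta, c_plus, c_minus):
    """v(x) = inf_y C(y - x) + L(y).

    The objective is convex and piecewise linear in y with kinks only at
    y = x and y = gamma, so (for nonnegative costs) the minimum is at a kink.
    """
    return min(
        transition_cost(y - x, c_plus, c_minus) + loss(y, gamma, alpha, beta)
        for y in (x, gamma)
    )


def dual_value(x, gamma, alpha, beta, c_plus, c_minus):
    """d(x) = sup of pi*(gamma - x) over -c_minus <= pi <= c_plus and -alpha <= pi <= beta."""
    lo = max(-c_minus, -alpha)
    hi = min(c_plus, beta)
    if lo > hi:
        return -math.inf  # dual infeasible
    # a linear function on an interval is maximized at an endpoint
    return max(pi * (gamma - x) for pi in (lo, hi))
```

For example, with `gamma=10`, `alpha=1`, `beta=5`, `c_plus=2`, `c_minus=1`, both functions return 12 at `x=4` (order up to the goal) and 5 at `x=15` (dispose down to the goal).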

The minus sign in condition (b) arises because of the sign of the argument of $L^*$ in the
dual problem. The proposition shows when an optimum solution exists to the primal and dual
problems.

We will now examine the characterization of optimal solutions to the problem. It will be
assumed hereafter that the dual interior condition (b) is satisfied so that the optimal value
is bounded from below and is attained in the primal problem for some $y$. Furthermore, there is
no duality gap and $d(x) = v(x)$. We note that (b) does not depend on the vector $x$ that
parameterizes the problem. Let us also assume that the primal problem (CP) has a feasible
solution. This in conjunction with the bound below ensures that the optimum is finite.

The Kuhn-Tucker optimality conditions for $(\pi^\circ, y^\circ, z^\circ)$ to be optimal in
(CP) or (UCP) are given by

(i) $-\pi^\circ \in \partial L(y^\circ)$

(ii) $\pi^\circ A \le c$
     $Az^\circ = y^\circ - x$
     $z^\circ \ge 0$

(iii) $[c - \pi^\circ A]\, z^\circ = 0$

These conditions are sufficient for optimality in the problem, since it involves minimization
of a convex function over a convex set. They are necessary if the primal interior point
condition (a) is also met so that the optimum is achieved in the primal and the dual.
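
Conditions (i)-(iii) are easy to test mechanically for a candidate triple. The following sketch does so under the simplifying assumption that $L$ is differentiable, so that (i) reads $-\pi^\circ = \nabla L(y^\circ)$; the function name, the list-of-rows matrix layout, and the tolerance are illustrative choices rather than anything prescribed above.

```python
def check_kuhn_tucker(pi, y, z, x, A, c, grad_L, tol=1e-9):
    """Return True if (pi, y, z) satisfies conditions (i)-(iii) for initial stock x.

    A is an n-by-m matrix given as a list of rows, c an m-vector, and
    grad_L a function returning the gradient of L at y (differentiable L assumed).
    """
    n, m = len(A), len(c)

    # (i) -pi is the gradient of L at y
    g = grad_L(y)
    stationarity = all(abs(-pi[i] - g[i]) <= tol for i in range(n))

    # (ii) dual feasibility pi A <= c, written through the reduced costs c - pi A
    reduced = [c[j] - sum(pi[i] * A[i][j] for i in range(n)) for j in range(m)]
    dual_feasible = all(r >= -tol for r in reduced)

    # (ii) primal feasibility Az = y - x with z >= 0
    balance = all(
        abs(sum(A[i][j] * z[j] for j in range(m)) - (y[i] - x[i])) <= tol
        for i in range(n)
    )
    primal_feasible = balance and all(z_j >= -tol for z_j in z)

    # (iii) complementary slackness (c - pi A) z = 0
    comp_slack = abs(sum(reduced[j] * z[j] for j in range(m))) <= tol

    return stationarity and dual_feasible and primal_feasible and comp_slack
```

As a usage example, take one location with an ordering activity (cost 2) and a disposal activity (cost 1), so `A = [[1, -1]]` and `c = [2, 1]`, and the loss $L(y) = (y - 10)^2$. For `x = [4]` the triple `pi = [2]`, `y = [9]`, `z = [5, 0]` passes the check, whereas moving all the way to `y = [10]` with `z = [6, 0]` fails condition (i).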

It can be seen that (ii) and (iii) are simply the optimality conditions for the dual pair of
linear programs LP($y^\circ - x$) and LD($y^\circ - x$). Since $\pi^\circ$ is optimal in the
dual program, it is a subgradient of $C(\cdot)$ at $(y^\circ - x)$. Similarly, the condition
(i) applies to a vector $y^\circ$ optimal in CS($\pi^\circ$). Thus the optimality conditions
can be interpreted as saying that $(y^\circ, \pi^\circ)$ is optimal if
$-\pi^\circ \in \partial L(y^\circ)$ and $\pi^\circ \in \partial C(y^\circ - x)$. We may take
the optimal activity vector $z^\circ$ to be impli-




The solution is unique if the dual problem (LD) is nondegenerate at $\pi^\circ$.


We may also interpret the optimality conditions in terms of the dual problem by examining
the dual problem in the "unconstrained" form:

(UCD)    $d(x) = \sup_\pi \{ -L^*(-\pi) - C^*(\pi) - \pi x \}$.

The optimality conditions for this problem require that there exist a $\pi^\circ$ such that the
subdifferential of the function $[-L^*(-\pi) - C^*(\pi) - \pi x]$ at $\pi^\circ$ contains 0.
Let $y^\circ = y^\circ(\pi^\circ)$, which is the solution to (CS) for $\pi = \pi^\circ$. Then
$y^\circ$ is a subgradient of $-L^*$ at $-\pi^\circ$ when $L$ is closed (Rockafellar [19]
p. 218). Given that dom $L^*$ $\cap$ $-$(dom $C^*$) is nonempty, we can write that subgradients
of the objective function are given by $(y^\circ - \lambda_c - x)$, where $\lambda_c$ is a
subgradient of $C^*$ at $\pi^\circ$. Since $C^*$ is an indicator function, its subgradients are
exactly the normal vectors to dom $C^* = \{\pi \mid \pi A \le c\}$. Hence optimality in the
dual requires that $y^\circ - \lambda_c - x = 0$ at some $\pi^\circ$. This says that
$y^\circ - x = \lambda_c$, or that $y^\circ - x$ must lie in the normal cone to dom $C^*$ at
$\pi^\circ$. This implies that $\pi^\circ$ is optimal in LD($y^\circ - x$); this and $y^\circ$
optimal in CS($\pi^\circ$) are equivalent to the Kuhn-Tucker conditions.

EQUIVALENT  STOCHASTIC  PROGRAMS 


In  [18],  Parikh  formulated  an  equivalent  stochastic  program  to  the  simple-linear-recourse 
problem  of  Williams.  The  interpretation  of  the  equivalent  program  was  that  activities  available 
at  the  recourse  stage  could  be  made  available  at  the  initial  stage  without  altering  the  problem. 
This  section  presents  a generalization  of  such  equivalent  programs  for  any  arbitrary  convex  L. 
This  construction  does  not  appear  to  be  useful  in  solving  the  problem,  but  it  provides  some 
insight  into  the  structure  of  the  problem  and,  as  in  Parikh’s  analysis,  may  yield  qualitative 
results  in  particular  cases. 


The basic idea is simply that adding redundant constraints in the dual does not change the
problem. It will be assumed as before that the dual problem (CD) satisfies the interior point
condition. This assumption does not depend on the particular value of $x$, the "initial stock"
vector by which the problem is parameterized. The assumption also ensures that $v(x) = d(x)$
and that the optimum in the primal problem (UCP) is attained for some $y$.

Suppose now that we add to the dual problem (CD) a set of linear constraints $\sigma B \le d$
that are redundant in the sense that $\sigma \in$ dom $L^*$ implies that $\sigma B \le d$.
Clearly, the problem with the added constraints is completely equivalent to the original
problem. Let us denote the new problem (CD'):

(CD')    $d'(x) = \sup_\sigma \{ \sigma x - L^*(\sigma) \}$

         s.t. $-\sigma A \le c$,
              $\sigma B \le d$.

The problem can be written in primal form, where $-\sigma = \pi$:

(UCP')    $v'(x) = \inf_y \{ C'(y - x) + L(y) \}$,



where

LD'($y - x$)    $C'(y - x) = \max_\pi\ \pi(y - x)$

                s.t. $\pi A \le c$,
                     $-\pi B \le d$,

or by writing $C'(y - x)$ explicitly in primal form,

(CP')    $v'(x) = \inf_{y,\,z,\,u} \{ cz + du + L(y) \}$

         s.t. $Az - Bu = y - x$,
              $z, u \ge 0$.

This problem and the original problem (CP) are equivalent in the following sense:

PROPOSITION 3:

(i) $v(x) = v'(x)$

(ii) If $(\pi^\circ, y^\circ, z^\circ, u^\circ)$ is optimal in (CP') or (CD') then
$(\pi^\circ, y^\circ + Bu^\circ, z^\circ)$ is optimal in (CP) or (CD).

PROOF: The first assertion follows from the complete equivalence of the dual problems
and from the dual interior point condition (which ensures that no duality gap exists).

It is also clear from the equivalence of the dual problems that the same $\pi^\circ$ will be
optimal in both problems. More explicitly, we will show that

(a) $\pi^\circ$ is optimal in LD($y^\circ + Bu^\circ - x$), and

(b) $y^\circ + Bu^\circ$ is optimal in CS($\pi^\circ$).

The second condition shows that $(y^\circ + Bu^\circ)$ is a subgradient of $-L^*(-\pi^\circ)$
at $-\pi^\circ$, and the first condition says that $(y^\circ + Bu^\circ - x)$ lies in the
normal cone to dom $C^*$ at $\pi^\circ$.

To show (a), we use the fact that $\pi^\circ$ is optimal in LD'($y^\circ - x$). By partial
dualization on the added constraints, using the optimal multipliers $u^\circ$, we have that
$\pi^\circ$ is optimal in

$\max_\pi\ \pi(y^\circ - x) - (-\pi B - d)\, u^\circ \;=\; \max_\pi\ \pi(y^\circ + Bu^\circ - x) + du^\circ$

s.t. $\pi A \le c$.

This proves (a). To prove (b), we first show that $Bu^\circ$ is in the normal cone to dom $L^*$
at $\sigma^\circ = -\pi^\circ$. For any $\sigma \in$ dom $L^*$, we have $\sigma B \le d$ by
hypothesis. Thus, for $u^\circ \ge 0$ we have $(d - \sigma B)\, u^\circ \ge 0$. Furthermore,
for $\sigma^\circ = -\pi^\circ$ we have $(d - \sigma^\circ B)\, u^\circ = 0$ by the
complementary slackness condition in (LD'). Subtraction gives

$-\sigma B u^\circ + \sigma^\circ B u^\circ \ge 0$ or $Bu^\circ(\sigma^\circ - \sigma) \ge 0$, $\sigma \in$ dom $L^*$,

which says that $Bu^\circ$ is in the normal cone to dom $L^*$ at $\sigma^\circ$. Next we show
that if $y^\circ \in \partial L^*(\sigma^\circ)$ and if $\gamma$ is in the normal cone to
dom $L^*$ at $\sigma^\circ$, then $y^\circ + \gamma \in \partial L^*(\sigma^\circ)$. By




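hypothesis we have $\gamma(\sigma - \sigma^\circ) \le 0$ for any $\sigma \in$ dom $L^*$.
Furthermore, $L^*(\sigma) \ge L^*(\sigma^\circ) + y^\circ(\sigma - \sigma^\circ)$ for any
$\sigma \in$ dom $L^*$. Adding these gives
$L^*(\sigma) \ge L^*(\sigma^\circ) + (y^\circ + \gamma)(\sigma - \sigma^\circ)$,
$\sigma \in$ dom $L^*$, which says that $(y^\circ + \gamma) \in \partial L^*(\sigma^\circ)$.
Now, to prove (b), we have that $y^\circ \in \partial L^*(\sigma^\circ)$ by hypothesis and that
$Bu^\circ$ is in the normal cone to dom $L^*$ at $\sigma^\circ$. Using the result shown above,
we have $y^\circ + Bu^\circ \in \partial L^*(\sigma^\circ)$, which proves (b). □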

PARAMETRIC  ANALYSIS 

Two  topics  of  interest  in  a parametric  analysis  of  the  problem  are  discussed  here.  One  is 
the form of the optimal policy in the sense of the characteristics of the optimal $(\pi, y)$ pairs for a
given  initial  state  vector  x.  The  second  topic  is  the  study  of  the  parametric  cost  function  v(x). 
A third  topic  that  is  not  pursued  here  but  is  also  of  great  interest  is  the  sensitivity  of  the  cost 
function  v(x)  to  changes  in  the  cost  parameters  of  the  problem.  This  topic  is  more  problem- 
specific  and  is  better  discussed  in  the  context  of  some  specific  numerical  algorithm. 

The  motivation  for  studying  the  form  of  the  optimal  policy  arises  from  the  observation 
that  such  qualitative  results  have  typically  been  useful  in  inventory  theory  as  well  as  other 
fields.  Well  known  examples  are  the  "order-up-to"  and  "(s,S)"  forms  of  policies  that  arise  in  the 
single-location inventory problem and the "bang-bang" decision rule of control theory. The
characteristics  of  the  function  v(x)  are  useful,  since  the  function  relates  the  costs  incurred  to 
the  initial  state  x.  The  properties  of  v(x)  are  also  important  in  the  study  of  the  multiperiod 
and  infinite  horizon  problems. 

In the following we shall denote by $K'$ the "normal cone" to dom $C^*$ at $\pi'$. That is to
say, $K'$ is the set of vectors $y$ such that $(\pi - \pi')\,y \le 0$ for any
$\pi \in$ dom $C^*$. It follows from this that, for any $y \in K'$, $\pi y$ is maximized over
dom $C^*$ at $\pi'$, since rearranging the inequality above gives $\pi' y \ge \pi y$,
$\forall \pi \in$ dom $C^*$. Since $C^*$ is the indicator function of a (polyhedral) convex
set, it is also known that $K'$ is the same as $\partial C^*(\pi')$. We note that for any
$\pi \in$ dom $C^*$ we always have $0 \in K_\pi$, i.e., $0 \in \partial C^*(\pi)$.

The following proposition shows how optimal $(\pi^\circ, y^\circ)$ pairs can be constructed and
identifies the set of initial state vectors $x$ for which the pairs are optimal.

PROPOSITION 4: Suppose that $\pi^\circ$ is dual feasible and $L^*$ is subdifferentiable at
$-\pi^\circ$. Let $y^\circ \in \partial L^*(-\pi^\circ)$. Then $(\pi^\circ, y^\circ)$ is optimal
in (CP)-(CD) for all $x$ such that $y^\circ - x \in K^\circ$.

PROOF: $(\pi^\circ, y^\circ)$ satisfies the optimality conditions for (CP)-(CD). □

Looking at the dual problem (CD) or (UCD) more closely, we can distinguish some special
cases:


(i) If $\pi^\circ \in$ int dom $C^*$ and $y^\circ \in \partial L^*(-\pi^\circ)$, then
$(\pi^\circ, y^\circ)$ is optimal in (CP)-(CD) only for $x = y^\circ$. This follows since
$K^\circ$ in this case consists only of the null vector.

(ii) If $\pi^\circ$ is dual feasible and $y^\circ \in \partial L^*(-\pi^\circ)$,
$(\pi^\circ, y^\circ)$ is always optimal for $x = y^\circ$, since we always have
$0 \in K^\circ$.

(iii) If $-\pi^\circ \in$ int dom $L^*$ it is well known that the set of
$y^\circ \in \partial L^*(-\pi^\circ)$ is convex and bounded.



(iv) If $-\pi^\circ \notin$ int dom $L^*$, the set of $y^\circ \in \partial L^*(-\pi^\circ)$ is
unbounded. In particular, following the discussion in the proof of Proposition 3, if
$y^\circ \in \partial L^*(-\pi^\circ)$ and if $\gamma$ is in the normal cone to dom $L^*$ at
$-\pi^\circ$, then $y^\circ + \gamma \in \partial L^*(-\pi^\circ)$, and clearly
$y^\circ + \lambda\gamma \in \partial L^*(-\pi^\circ)$ for $\lambda > 0$.

(v) If $L^*$ is not subdifferentiable at $-\pi^\circ$, $\pi^\circ$ cannot be optimal in the
dual problem for any $x$ (by definition) when the interior point condition is assumed to hold.

(vi) Let $S = \{\, y \mid y \in \partial L^*(-\pi),\ -\pi \in \mathrm{dom}\, L^* \cap (-\mathrm{dom}\, C^*) \,\}$.
This is the set of all target vectors $y$ generated by applying the subdifferential mapping
$\partial L^*(-\pi)$ to dual feasible $\pi$. If $(\pi^\circ, y^\circ)$ is optimal in (CP)-(CD)
for some $x$, clearly $y^\circ \in S$. Furthermore, from comment (ii) above, if $x = y^\circ$
the optimal policy is to stay put at $y^\circ$. Thus we may think of $S$ as the set of "static"
target states.

PROPOSITION 5: If $\pi_1$ and $\pi_2$ are dual feasible in (CD), $y_1 \in \partial L^*(-\pi_1)$,
and $y_2 \in \partial L^*(-\pi_2)$, we cannot have $y_1 - y_2 \in K^1$; in other words, $y_2$
cannot lie in the set of points $x$ for which $(\pi_1, y_1)$ is optimal in (CP)-(CD).
Similarly, $y_2 - y_1 \notin K^2$.

PROOF: We have in general that if, for some convex function $f$,
$\sigma_1 \in \partial f(x_1)$ and $\sigma_2 \in \partial f(x_2)$, then

$f(x_2) \ge f(x_1) + \sigma_1(x_2 - x_1)$

and

$f(x_1) \ge f(x_2) + \sigma_2(x_1 - x_2)$.

Adding and simplifying, we get that

$(\sigma_1 - \sigma_2)(x_2 - x_1) \le 0$.

For the case above, we have $(y_1 - y_2)(-\pi_2 + \pi_1) \le 0$ or
$(y_1 - y_2)(\pi_2 - \pi_1) \ge 0$.

Now if $y_1 - y_2 \in K^1$ then, by the definition of the normal cone $K^1$,
$(y_1 - y_2)(\pi - \pi_1) \le 0$ for any $\pi \in$ dom $C^*$ and hence for any $\pi$ which is
dual feasible. In particular, $\pi_2$ is dual feasible, which gives
$(y_1 - y_2)(\pi_2 - \pi_1) \le 0$, leading to a contradiction unless
$(y_1 - y_2)(\pi_2 - \pi_1) = 0$. □
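
The monotonicity inequality $(y_1 - y_2)(\pi_2 - \pi_1) \ge 0$ used in this proof can be checked directly in a case where $L^*$ is available in closed form. The sketch below assumes, purely for illustration, a scalar quadratic loss $L(y) = k(y - y^*)^2$ with $k > 0$, for which $L^*(\sigma) = \sigma y^* + \sigma^2/(4k)$ and $\partial L^*(\sigma)$ is the single point $y^* + \sigma/(2k)$; the function names are hypothetical.

```python
def conjugate_gradient(sigma, k, y_star):
    """Gradient of L*(sigma) = sigma*y_star + sigma**2/(4k), the conjugate of k*(y - y_star)**2."""
    return y_star + sigma / (2.0 * k)


def monotonicity_gap(pi_1, pi_2, k, y_star):
    """Return (y_1 - y_2)*(pi_2 - pi_1), where y_i is the target generated by pi_i.

    The target associated with pi_i is the (unique) element of the
    subdifferential of L* at -pi_i.
    """
    y_1 = conjugate_gradient(-pi_1, k, y_star)
    y_2 = conjugate_gradient(-pi_2, k, y_star)
    return (y_1 - y_2) * (pi_2 - pi_1)
```

In this quadratic case the gap works out to $(\pi_2 - \pi_1)^2/(2k)$, so it is strictly positive whenever the two dual vectors differ and zero only when they coincide.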

From an intuitive point of view, the optimal policy can be thought of as a set of optimal
or "target" state vectors, such that if the initial state is in the set, then no action is
taken. This set of target vectors is generated from the dual feasible region via the
subdifferential of $L^*$. Associated with each target vector $y^\circ$ is a set of stocks $x$
for which $y^\circ$ is optimal, and this set of stocks is defined by the subdifferential of
$C^*$. From Proposition 5 we know that if a target vector $y^\circ$ is in the interior of the
"static" set $S$, then it can only be optimal for $x = y^\circ$. If $\pi^\circ$ is not in
int(dom $C^*$) then $y^\circ$ is optimal for all $x$ such that $y^\circ - x \in K^\circ$, and
the optimal act is to move to state $y^\circ$. This is achieved through the set of activities
that are optimal in LP($y^\circ - x$), and the marginal cost of this transition is given by
$\pi^\circ$. Examples of this form of policy for distribution problems are given by Karmarkar
and Patel [14] and Karmarkar [11].

We next examine the nature of the optimal cost function $v(x)$:

PROPOSITION 6:

(i) $v(\cdot)$ is a closed, proper, convex function.

(ii) dom $v = \{\, x \mid x = y - k,\ y \in \mathrm{dom}\, L,\ k \in \mathrm{dom}\, C \,\}$




(iii) $v(x) \le L(x)$ for all $x$.

(iv) $v(x) = L(x)$ for $x \in S$.

(v) $v^*(\sigma) = L^*(\sigma) + C^*(-\sigma)$

(vi) dom $v^*$ = dom $L^*$ $\cap$ ($-$dom $C^*$)

PROOF: From (CD) and (UCD) we see that $v(x) = d(x)$ can be regarded as the conjugate of
$L^*$ restricted to ($-$dom $C^*$). Assertions (i), (v), and (vi) follow from this
observation. Part (iii) follows, since $z = 0$, $y = x$ is always a feasible solution to (CP)
and hence $v(x) \le L(x)$. For part (ii), if $x = y - k$, $y \in$ dom $L$, $k \in$ dom $C$ then
$(y - x) \in$ dom $C$ and $v(x) \le C(y - x) + L(y) < +\infty$ from (UCP). If on the other hand
$y - x \notin$ dom $C$ for all $y \in$ dom $L$ then the optimal value in (UCP) must be
$+\infty$. Assertion (iv) follows from the comments (ii) and (vi) following Proposition 4. □

PROPOSITION 7: Let $(\pi^\circ, y^\circ)$ be optimal in (CP)-(CD) for initial state $x^\circ$.

(i) $v(x^\circ) = \pi^\circ(y^\circ - x^\circ) + L(y^\circ) = -L^*(-\pi^\circ) - \pi^\circ x^\circ$,

(ii) $v(x) \ge v(x^\circ) - \pi^\circ(x - x^\circ)$ for all $x$,

(iii) If $y^\circ - x \in K^\circ$ then $v(x) = v(x^\circ) - \pi^\circ(x - x^\circ)$,
and $(\pi^\circ, y^\circ)$ is optimal in CP($x$), CD($x$).

PROOF: Part (i) is clear. From (CD) and (UCD) we see that, since $L^*$ and $C^*$ are
closed, $-\pi^\circ \in \partial v(x^\circ)$ and hence (ii) follows. For part (iii) we note
that $(\pi^\circ, y^\circ)$ satisfy the optimality conditions for (CP) and (CD) for initial
state $x$, and the rest follows using part (i). □

The following proposition shows that $v(x)$ can be majorized by functions that are
piecewise linear over their domains.

PROPOSITION 8:

$v(x) \le L(y) + C'(y - x) \le L(y) + C(y - x)$, $\forall y \in R^n$.

PROOF: The first inequality is clear from Proposition 3 (i) and the statement of (UCP').
The second follows from the observation that (LP') is a relaxation of (LP). □

If $y \in S$, the "static set" of states, then equality holds at $x = y$, since $v(y) = L(y)$
and $C'(0) = C(0) = 0$. Furthermore, $C'(\cdot)$ and $C(\cdot)$ are positively homogeneous
functions of their arguments and are piecewise linear functions, since they represent
parametric linear programs.

Finally, let $Y^\circ(x^\circ)$ be the set of optimal vectors $y$ for initial state $x^\circ$.
Since $-\pi^\circ \in \partial v(x^\circ)$, we can write
$Y^\circ(x^\circ) = \partial L^*[\partial v(x^\circ)]$.

AN  EXAMPLE 

Consider the one-dimensional case with quadratic loss function $L(y) = k(y - y^*)^2$ for
some given $k$, $y^*$. Suppose that the available activities allow either the increase or
disposal of "stock" from the initial state $x$. The problem may be written as




$v(x) = \inf\ c^+ z^+ + c^- z^- + L(y)$
        s.t. $z^+ - z^- = y - x$,
             $z^+, z^- \ge 0$.

We have

$C(y - x) = \min\ c^+ z^+ + c^- z^-$
            s.t. $z^+ - z^- = y - x$,
                 $z^+, z^- \ge 0$,

          $= \max\ \pi(y - x)$
            s.t. $-c^- \le \pi \le c^+$.

Thus we can explicitly exhibit $C(\cdot)$ as the piecewise linear, convex function given by

$C(x) = c^+ x$ for $x \ge 0$;
$C(x) = -c^- x$ for $x < 0$.

The convex subproblem is

$-L^*(-\pi) = \inf_y \{ \pi y + L(y) \}$.

The optimal solution to this subproblem is $y = y^* - \pi/(2k)$ and hence
$-L^*(-\pi) = \pi y^* - \pi^2/(4k)$. The conjugate function of $C$ is simply given by

$C^*(\pi) = 0$ if $-c^- \le \pi \le c^+$;
$C^*(\pi) = +\infty$ otherwise.

Hence the dual problem can be written as

$d(x) = \sup_\pi\ (\pi y^* - \pi^2/(4k)) - \pi x$
        s.t. $-c^- \le \pi \le c^+$.

If we assume that $-c^- < c^+$ (dual interior point condition) then $v(x) = d(x)$, and the
infimum in the primal problem is attained. Since the primal problem is always feasible, the
infimum is finite.


We can now generate the optimal policy for all initial states $x$ by using the
characterizations in Propositions 3 and 5. The set $S$ can be thought of as the set of $y$ such
that there is some dual feasible $\pi$ such that $-\pi \in \partial L(y)$. Once this set is
generated, a point in the set is optimal for all $x$ such that $y - x \in K_\pi$. The cones
$K_\pi$ associated with the dual are the positive half line at $\pi = c^+$ and the negative
half line at $\pi = -c^-$. Without going into further detail, we can sketch $v(x)$ and the set
$S$ in Figure 1. The set $S$ lies between $y^+$ and $y^-$, where $y^+$ is the point at which
$\partial L(y^+)$ is $-c^+$, and similarly $c^- \in \partial L(y^-)$. The optimal policy is to
do nothing if $x \in S$, to "order" $(y^+ - x)$ if $x < y^+$ and to "dispose of" $(x - y^-)$ if
$x > y^-$. This is essentially the familiar "order-up-to" model of inventory theory but with a
different loss structure. It may also be thought of as a one-dimensional goal-programming
model with quadratic loss.
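
The policy just described can be written out in a few lines. The sketch below is illustrative only: it assumes $k > 0$ and $c^+ + c^- > 0$, uses the closed forms $y^+ = y^* - c^+/(2k)$ and $y^- = y^* + c^-/(2k)$ that follow from $-\pi \in \partial L(y)$ at $\pi = c^+$ and $\pi = -c^-$, and adds a crude grid search (a hypothetical helper, not a method from the text) as an independent check of the value $v(x)$.

```python
def optimal_policy(x, k, y_star, c_plus, c_minus):
    """Return (y, pi, v) for the one-dimensional quadratic-loss example."""
    y_order = y_star - c_plus / (2.0 * k)     # y^+: order up to this level
    y_dispose = y_star + c_minus / (2.0 * k)  # y^-: dispose down to this level
    if x < y_order:
        y, pi = y_order, c_plus
    elif x > y_dispose:
        y, pi = y_dispose, -c_minus
    else:
        # x lies in the static set S: stay put, with -pi equal to L'(x)
        y, pi = x, -2.0 * k * (x - y_star)
    t = y - x
    move_cost = c_plus * t if t >= 0 else -c_minus * t
    return y, pi, move_cost + k * (y - y_star) ** 2


def grid_search_value(x, k, y_star, c_plus, c_minus, lo, hi, steps=20000):
    """Approximate v(x) by minimizing C(y - x) + L(y) over a grid on [lo, hi]."""
    best = float("inf")
    for i in range(steps + 1):
        y = lo + (hi - lo) * i / steps
        t = y - x
        move_cost = c_plus * t if t >= 0 else -c_minus * t
        best = min(best, move_cost + k * (y - y_star) ** 2)
    return best
```

For instance, with `k=1`, `y_star=10`, `c_plus=2`, `c_minus=1` the static set is the interval from 9 to 10.5; starting from `x=4` the sketch orders up to 9 at total cost 11, and the returned `pi=2` also reproduces this value through the dual objective $\pi y^* - \pi^2/(4k) - \pi x$.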




FIGURE 1. Optimal policy and value function
(One variable, quadratic loss example)


A SUBSTITUTION  PROPERTY 

Williams in [27] makes a comment to the effect that a substitution result similar to that
for deterministic Leontieff models also holds for stochastic programs. However, the nature of
this property is not clearly specified. A general version of such a property for the model
studied here is stated below. It follows as a consequence of previous results on the form of
the policy. A particularization of the result for the Leontieff case is then described in a way
that makes the connection with the so-called Substitution Theorem clear.

PROPOSITION 9: Suppose that $(\pi^\circ, y^\circ)$ is optimal in (CP)-(CD) for initial state
$x^\circ$ and that $B^\circ$ is the unique optimal basis in LP($y^\circ - x^\circ$). Then
$B^\circ$ is also an optimal basis for any initial state $x$ such that
$(y^\circ - x) \in K^\circ$.

PROOF: This is a restatement of Proposition 7 (iii) with the added requirement that
LD($y^\circ - x^\circ$) is not degenerate at $\pi^\circ$. This ensures that the same basis $B$
always corresponds to $\pi^\circ$. □

Note that if $K(B)$ denotes the cone generated by positive linear combinations of the columns
of $B$, then in general $K(B) \subset K^\circ$. Furthermore, we can drop the uniqueness
requirement in the theorem above if we replace $K^\circ$ by $K(B)$.

The  specialization  of  this  result  to  the  Leontieff  case  is  of  interest  since  in  uncapacitated 
multilocation  problems,  the  matrix  A generally  has  a Leontieff  structure. 


DEFINITION:  A matrix  A is  said  to  be  Leontieff  if 


(i)  Every  column  of  A has  exactly  one  positive  entry. 

(ii)  There  is  some  z > 0 such  that  Az  > 0. 
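
The two conditions of the definition can be checked mechanically for a small matrix. The sketch below is an illustration only: the function name is hypothetical, condition (ii) is verified for a candidate vector z supplied by the caller rather than searched for, and reading "z > 0" as componentwise nonnegativity is an assumption.

```python
def is_leontief(A, z, tol=1e-12):
    """Check the two defining conditions for a matrix A given as a list of rows.

    z is a candidate witness for condition (ii). This function only verifies
    the witness; it does not search for one.
    """
    m, n = len(A), len(A[0])
    # (i) every column of A has exactly one positive entry
    for j in range(n):
        positives = sum(1 for i in range(m) if A[i][j] > tol)
        if positives != 1:
            return False
    # (ii) the supplied z is nonnegative (assumed reading) and Az > 0 componentwise
    if len(z) != n or any(zj < 0 for zj in z):
        return False
    Az = [sum(A[i][j] * z[j] for j in range(n)) for i in range(m)]
    return all(v > tol for v in Az)


# Constraint matrix of a two-location system with columns (Z1, z12, z21).
A = [[1, -1, 1],
     [0, 1, -1]]
print(is_leontief(A, [3, 1, 0]))
```

A general test of condition (ii) would require solving a small linear program; supplying a witness keeps the sketch free of external libraries.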


Consider the problem LP(b):

$$
\begin{aligned}
\min\ & cz \\
\text{s.t.}\ & Az = b, \\
& z \ge 0.
\end{aligned}
$$

The  well-known  Substitution  Theorem  (Samuelson  [21])  is  stated  here  without  proof: 


PROPOSITION 10: Suppose $A$ is Leontieff, and that $B$ is an optimal basis in LP$(b^0)$ for
some $b^0 > 0$. Then $B$ is optimal in LP(b) for any $b \ge 0$.


Let $O^+$ denote the nonnegative orthant, i.e., $O^+ = \{b \mid b \ge 0\}$. Furthermore, let $\pi^0$
correspond to the basis $B$, i.e., $\pi^0 = c_B B^{-1}$. We then have the corollaries:


PROPOSITION 11:


PROPOSITION 12: If $A$ is Leontieff there is a unique dual vector $\pi^0$ optimal for all $b > 0$.


PROPOSITION 13: If $\pi$ is a feasible dual vector in LP(b) then $\pi \le \pi^0$. (Dual
feasibility does not depend on $b$.)


There has been considerable interest recently in the properties of constraint sets in Leontieff
and "Hidden Leontieff" systems. Some relevant references in this connection are Cottle and
Veinott [5], Saigal [20], Yu [29], and Gabbay [9]. The implications for stochastic programs and
distribution problems have been discussed by Karmarkar [11].


PROPOSITION 14: Suppose that $A$ is Leontieff and that $\pi^0$ is optimal in LP(b) for all $b > 0$.
Furthermore, suppose that $L^*$ is subdifferentiable at $-\pi^0$ and let $y^0 \in \partial L^*(-\pi^0)$. Then
$y^0$ is optimal in (CP) for all $x \le y^0$.

PROOF: $(\pi^0, y^0)$ satisfy the optimality conditions for (CP)-(CD). ∎


In other words, under suitable assumptions there is a base stock vector $y^0$ such that whenever
stocks are low enough the optimal policy is to raise them to the base stock levels. When
the positive orthant $O^+$ is strictly contained in $K_{\pi^0}$, the set of starting stocks for which $y^0$ is
optimal may be even larger, as is shown by the following example.

AN INVENTORY EXAMPLE


Consider a two-location system where only location 1 can order exogenously ($Z_1$) at cost
$p_1$ per unit. Transshipment $z_{ij}$ costs $c_{ij}$ per unit. Starting stocks $x_i$ are available at each location
(possibly negative). If after ordering and transshipping $y_i$ is the stock at location $i$, then $l_i(y_i)$
represents the usual costs of overstocking and understocking after demand is realized (assumed
convex). The problem may be stated as follows.



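$$
\begin{aligned}
\inf\ & p_1 Z_1 + c_{12} z_{12} + c_{21} z_{21} + l_1(y_1) + l_2(y_2) \\
\text{s.t.}\ & Z_1 - z_{12} + z_{21} = y_1 - x_1, \\
& z_{12} - z_{21} = y_2 - x_2, \\
& Z_1, z_{12}, z_{21} \ge 0.
\end{aligned}
$$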

Note  that  the  constraint  matrix  is  Leontieff. 

The  (LP)  subproblem  is 

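$$
\begin{aligned}
C(y - x) = \min\ & p_1 Z_1 + c_{12} z_{12} + c_{21} z_{21} \\
\text{s.t.}\ & Z_1 - z_{12} + z_{21} = y_1 - x_1, \\
& z_{12} - z_{21} = y_2 - x_2, \\
& Z_1, z_{12}, z_{21} \ge 0.
\end{aligned}
$$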

Its  dual  is 

$$
\begin{aligned}
C(y - x) = \max\ & \pi_1 (y_1 - x_1) + \pi_2 (y_2 - x_2) \\
\text{s.t.}\ & \pi_1 \le p_1, \\
& -\pi_1 + \pi_2 \le c_{12}, \\
& \pi_1 - \pi_2 \le c_{21}.
\end{aligned}
$$
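
The primal and dual forms of this small (LP) subproblem can be compared directly. The sketch below is illustrative only: it assumes nonnegative unit costs, writes b1 = y1 - x1 and b2 = y2 - x2, and uses hypothetical function names. The primal function costs out an obvious feasible plan, and the dual function evaluates the two extreme points of the dual region; when the two values agree, weak duality confirms that both are optimal.

```python
def lp_cost_primal(b1, b2, p1, c12, c21):
    """Cost of a feasible plan: transship the net requirement at location 2
    in whichever direction is needed, and order the remainder at location 1."""
    if b1 + b2 < 0:
        return None  # infeasible: would require a negative order Z1
    z12 = max(b2, 0.0)       # ship 1 -> 2 when location 2 needs stock
    z21 = max(-b2, 0.0)      # ship 2 -> 1 when location 2 has excess
    Z1 = b1 + z12 - z21      # from the first constraint; equals b1 + b2
    return p1 * Z1 + c12 * z12 + c21 * z21


def lp_cost_dual(b1, b2, p1, c12, c21):
    """Largest dual objective over the two extreme points of the dual region."""
    if b1 + b2 < 0:
        return None  # dual unbounded along pi1 = pi2 -> -infinity
    extreme_points = [(p1, p1 + c12), (p1, p1 - c21)]
    return max(pi1 * b1 + pi2 * b2 for pi1, pi2 in extreme_points)
```

For example, with p1 = 5 and c12 = c21 = 1, both functions return 28 for (b1, b2) = (2, 3) and 16 for (b1, b2) = (4, -1).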

The  convex  subproblem  (CS)  decouples  into  two  subproblems: 

$$(CS_i) = \inf_{y_i}\ \pi_i y_i + l_i(y_i).$$

If we assume $c_{12} = c_{21} < p_1$ for the sake of exposition, the feasible region to the (LP) dual can
be sketched as in Figure 2. Assume further that the extreme points $(p_1, p_1 + c_{12})$ and
$(p_1, p_1 - c_{21})$ are dual feasible. Let $y_i^0$ represent the optimal solution to $CS_i(\pi_i)$, i.e.,
$y_i^0 \in \partial l_i^*(-\pi_i)$. Then the stock levels corresponding to these dual feasible points can be
computed from the subproblems $(CS_i)$ as in Figure 3. The extreme point $(p_1, p_1 + c_{12})$ in Figure
2 is the Leontieff maximal point and corresponds to the base stock level in Figure 3, which
shows the optimal policy for all starting stock conditions. The shape of the static region $S$
depends on the functions $l_i(y)$, and the cones are generated as normal cones to the (LP) dual
in Figure 2. A detailed account of the computations for such small problems together with
several examples can be found in Karmarkar [11] and Karmarkar and Patel [14, 15].
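
As a numerical illustration of the base stock level in this example, the sketch below assumes quadratic costs l_i(y) = 0.5 * a_i * (y - d_i)^2. This choice, the parameters a and d, and the function names are hypothetical and are used only to make the subproblems solvable in closed form; they are not taken from the paper.

```python
def base_stock(p1, c12, d, a):
    """Base-stock vector from the dual point (p1, p1 + c12), ASSUMING
    l_i(y) = 0.5 * a[i] * (y - d[i])**2, so that l_i'(y_i) = -pi_i
    gives y_i = d[i] - pi_i / a[i]."""
    pi = (p1, p1 + c12)
    return tuple(d[i] - pi[i] / a[i] for i in range(2))


def total_cost(x, y, p1, c12, c21, d, a):
    """Ordering and transshipment cost of moving from x to y plus the
    assumed quadratic costs at y."""
    b1, b2 = y[0] - x[0], y[1] - x[1]
    if b1 + b2 < 0:
        return float("inf")  # would require a negative exogenous order
    z12, z21 = max(b2, 0.0), max(-b2, 0.0)
    move = p1 * (b1 + b2) + c12 * z12 + c21 * z21
    hold = sum(0.5 * a[i] * (y[i] - d[i]) ** 2 for i in range(2))
    return move + hold


y0 = base_stock(p1=2, c12=1, d=(10, 8), a=(1, 1))
print(y0, total_cost((3, 2), y0, 2, 1, 1, (10, 8), (1, 1)))
```

Under these assumptions the base stock vector is (8, 5), and for a starting stock such as (3, 2), which lies below it in both components, a coarse grid search over y finds no cheaper target than the base stock vector itself.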

SUMMARY

This  paper  has  described  the  qualitative  analysis  of  a convex  programming  problem 
motivated  by  the  multilocation  distribution  problem  of  inventory  theory.  The  formulation  also 
arises in other contexts and in particular subsumes certain "active" stochastic programming
models.  A decomposition  approach  is  employed,  and  duality  and  optimality  conditions  for  the 
model  are  discussed.  In  a parametric  analysis  of  the  problem,  properties  of  the  optimal  value 
function  (perturbation  function)  are  described. 



In the spirit of inventory theory it is shown that the optimal policy has a certain geometric
form: it can be characterized as a set $S$ of points such that each point in $S$ is the vertex of a
convex cone. Points of $S$ are generated from dual feasible points via a subdifferential map, and
the corresponding cones are generated as the normal cones to the dual feasible points. The
optimal policy is to do nothing if the initial state is in $S$, and to move to the vertex of the cone
in which the initial state lies when it is outside $S$.

Finally, a general substitution property that follows from the analysis is described and the
result is specialized to the Leontieff case as a base-stock theorem. Two examples are described
to demonstrate the application of the methods.

The analysis of the properties of the value function $v$ is basic to the study of multiperiod
and infinite-horizon problems. These are studied elsewhere by Karmarkar [11, 13], and it is
shown that the same form of policy obtains. Computational methods for the one-period problem
have been described in [11, 15]. Some work is underway on approximate solutions to the
multiperiod problem.

It  is  thought  that  the  analysis  in  this  paper  forms  a unifying  basis  for  studying  multiloca- 
tion distribution  problems.  The  results  on  the  form  of  the  optimal  policy  also  apply  to  stochas- 
tic programming  and  convex  goal-programming  models  and  to  cash-management  and 
investment-consumption  problems. 

ACKNOWLEDGMENTS 

Part  of  this  work  was  supported  by  ONR  Contract  N00014-67-A-0204-0076  on  multilevel 
logistics  organization  models.  The  research  was  completed  under  a summer  research  grant 
from  the  Graduate  School  of  Business  of  the  University  of  Chicago.  The  ideas  underlying  this 
paper  originated  in  earlier  work  with  Dr.  Nitin  Patel,  to  whom  the  author  is  indebted  for  many 
discussions  and  suggestions. 


APPENDIX 

CONVEX  ANALYSIS 

Let $f$ be a function from $R^n$ to the extended reals, $R \cup \{+\infty, -\infty\}$. Then epi $f$ is the set
of points $(x, \mu)$ in $R^{n+1}$ such that $\mu \in R$, $x \in R^n$ and $\mu \ge f(x)$. The function $f$ is convex if
and only if (iff) epi $f$ is a convex set in $R^{n+1}$. The effective domain of a function $f$ is denoted
dom $f$ and is the set of points $x \in R^n$ such that $f(x) < +\infty$. The function $f$ is said to be
closed if epi $f$ is a closed set in $R^{n+1}$ and $f$ is said to be proper if epi $f$ contains no vertical
lines. We may always consider $f$ to be defined over all of $R^n$ by taking $f(x) = +\infty$ for all $x \notin$
dom $f$. Thus we can regard the extended function as $f(x) + \delta(x \mid \text{dom } f)$ where the indicator
function $\delta$ is 0 for $x \in$ dom $f$ and $+\infty$ elsewhere.

A subgradient of $f$ at some point $x \in$ dom $f$ is a vector $\sigma \in R^n$ such that
$f(z) \ge f(x) + \langle \sigma, z - x \rangle$, $z \in R^n$, where the operation $\langle \cdot, \cdot \rangle$ denotes the inner product of
two vectors. (We write $\sigma y$ to mean $\langle \sigma, y \rangle$ whenever there is no ambiguity about the
notation.) The accompanying diagram (Figure A1) shows the relation between subgradients at a
point $x$ and the (nonvertical) supporting hyperplanes to epi $f$ at $[x, f(x)] \in R^{n+1}$. Clearly, if $\sigma$
is a subgradient of $f$ at $x$, $(\sigma, -1) \in R^{n+1}$ is the normal vector identifying a supporting
hyperplane to epi $f$ at $[x, f(x)]$, since the latter condition implies that

$$\sigma x - f(x) \ge \sigma z - \mu, \quad \text{for } (z, \mu) \in \text{epi } f,$$

and in particular, taking $\mu = f(z)$,

$$\sigma x - f(x) \ge \sigma z - f(z),$$

which is the same as the definition of subgradient.


In  general  there  may  be  more  than  one  subgradient  (or  supporting  hyperplane)  at  a given 
point,  but  if  the  function  is  differentiable,  then  the  subgradient  is  unique  and  equal  to  the  gra- 
dient of  the  function. 


The set of all subgradients of $f$ at $x$ is called the subdifferential of $f$ at $x$ and is denoted $\partial f(x)$.
The subdifferential of $f$ is the set-valued mapping $\partial f : x \to \partial f(x)$, and the set $\partial f(x)$ is a closed
convex set which is always nonempty and bounded for any $x$ in the interior of dom $f$. A function
$f$ is said to be subdifferentiable at a point $x$ if $\partial f(x)$ is nonempty. As an example of a
closed, convex function that is not subdifferentiable (Rockafellar [19]), consider $f : R \to R$,

$$
f(x) = \begin{cases} -(1 - x^2)^{1/2}, & -1 \le x \le +1, \\ +\infty, & \text{elsewhere.} \end{cases}
$$

Here $f(x)$ is finite (equal to zero) at $x = +1$ and at $x = -1$, but epi $f$ does not have a
nonvertical supporting hyperplane at these points.


The interior of a set $A$ will be denoted int $A$ and is the set of points $x \in A$ such that there
is an open set $N \subset A$ which contains $x$ (there is an open neighborhood of $x$ contained in $A$).
The relative interior of $A$ is denoted ri $A$ and is defined as the interior of the set relative to aff
$A$, which is the smallest affine set containing $A$, i.e., $x \in$ ri $A$ if $x \in N \subset A$ and $N$ is open
relative to aff $A$. An affine set, it will be recalled, is a set of vectors such that if $x$ and $y$ are in the
set, then all points $z$ on the line through them given by $z = \lambda x + (1 - \lambda) y$, $\lambda \in R$, are also in
the set. An affine set can always be thought of as a translated subspace, and for every affine set
$M$ there is a unique subspace $S$ and $m \in R^n$ such that

$$M = \{y \mid y = x + m,\ x \in S\}$$

and furthermore

$$S = \{z \mid z = x - y,\ x, y \in M\}.$$



The closure of a set $A$ will be denoted cl $A$ and the boundary of a set $A$ is given by
$(\text{cl } A) \setminus (\text{int } A) = \{x \mid x \in \text{cl } A,\ x \notin \text{int } A\}$. Analogously, the relative boundary of $A$ is
$(\text{cl } A) \setminus (\text{ri } A)$.

An important duality correspondence used extensively here is that of conjugacy of convex
functions. The conjugate $f^*$ of a function $f$ is defined as

$$f^*(\sigma) = \sup_x \{\sigma x - f(x)\}.$$

For every $x$, $\sigma x - f(x)$ is an affine functional in $\sigma$, and hence $f^*(\sigma)$ is the supremum of a
collection of affine functions for each $\sigma$ and is thus a convex function. It can be shown that $f^*$
is closed and that $(f^*)^* = \text{cl } f$, where cl $f$ is the closed function whose epigraph is
cl(epi $f$). The function $f^*$ can also be thought of as the maximum value attained by the
hyperplane described by the normal $(\sigma, -1)$ over the set epi $f$ (see Figure A1).
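
The definition of the conjugate can be checked numerically in one dimension. The sketch below approximates the supremum by a maximum over a finite grid, which is an assumption that is reasonable only when the maximizing x lies inside the grid; the function name and the test function f(x) = x^2 / 2 (whose conjugate is known to be sigma^2 / 2) are chosen purely for illustration.

```python
def conjugate_on_grid(f, xs, sigma):
    """Approximate f*(sigma) = sup_x {sigma * x - f(x)} by a max over the grid xs."""
    return max(sigma * x - f(x) for x in xs)


def half_square(x):
    """Test function f(x) = x**2 / 2, whose conjugate is sigma**2 / 2."""
    return 0.5 * x * x


xs = [i / 100.0 for i in range(-1000, 1001)]  # grid on [-10, 10]
for sigma in (-2.0, 0.0, 3.0):
    print(sigma, conjugate_on_grid(half_square, xs, sigma), 0.5 * sigma * sigma)
```

Each printed line compares the grid value with the closed form sigma^2 / 2; the two agree here because the maximizer x = sigma is itself a grid point.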


A support function is defined as the conjugate function of an indicator function. Thus,
for a set $A$, the support function is given by

$$\delta^*_A(\sigma) = \sup_y \{\sigma y - \delta(y \mid A)\} = \sup_{y \in A} \{\sigma y\}.$$


REFERENCES 

[1] Aggarwal, S. C., "A Review of Current Inventory Theory and Its Applications," International Journal of Production Research, 12, 443-481 (1974).

[2] Allen, S. G., "Redistribution of Total Stock over Several User Locations," Naval Research Logistics Quarterly, 5, 337-348 (1958).

[3] Balanchandran, V., "The Stochastic Generalized Transportation Problem - An Operator Theoretic Approach," presented at the ORSA/TIMS Joint National Meeting, Boston (1974).

[4] Clark, A. J., "An Informal Survey of Multiechelon Inventory Theory," Naval Research Logistics Quarterly, 19, 621-650 (1972).

[5] Cottle, R. W., and A. F. Veinott, "Polyhedral Sets Having a Least Element," Mathematical Programming, 3, 238-249 (1972).

[6] Dantzig, G. B., "Linear Programming Under Uncertainty," Management Science, 1, 197-206 (1955).

[7] Das, C., "Supply and Redistribution Rules for Two Location Inventory Systems: One-Period Analysis," Management Science, 21, 765-776 (1975).

[8] Elmaghraby, S. E., "Allocation Under Uncertainty When the Demand Has a Continuous D. F.," Management Science, 6, 270-294 (1960).

[9] Gabbay, H., "A Note on Polyhedral Sets Having a Least Element," submitted for publication.

[10] Gross, D., "Centralized Inventory Control in Multilocation Supply Systems," Chap. 3 in Multistage Inventory Models and Techniques, H. Scarf, D. Gilford and M. Shelley (eds.) (Stanford University Press, Stanford, California 1963).

[11] Karmarkar, U. S., "Multilocation Distribution Systems," unpublished Ph.D. Thesis, Massachusetts Institute of Technology, Cambridge (1975).

[12] Karmarkar, U. S., "The Multilocation Distribution Problem," presented at the ORSA/TIMS Joint National Meeting, Chicago (1975).

[13] Karmarkar, U. S., "Multiperiod Multilocation Inventory Problems," submitted for publication.

[14] Karmarkar, U. S., and N. Patel, "The N-Location Distribution Problem," presented at the ORSA/TIMS Joint National Meeting, Boston (1974).

[15] Karmarkar, U. S., and N. Patel, "The One-Period, N-Location Distribution Problem," Naval Research Logistics Quarterly, 24, 559-575 (1977).

[16] Krishnan, K. S., and V. R. K. Rao, "Inventory Control in N Warehouses," Journal of Industrial Engineering, 16, 212-215 (1965).

[17] Magnanti, T. M., "Fenchel and Lagrange Duality Are Equivalent," Mathematical Programming, 7, 253-258 (1974).

[18] Parikh, S. C., "Equivalent Stochastic Linear Programs," Journal on Applied Mathematics, 18, 1-5 (1970).

[19] Rockafellar, R. T., Convex Analysis (Princeton University Press, Princeton, New Jersey 1970).

[20] Saigal, R., "On a Generalization of Leontieff Systems," Working Paper, University of California, Berkeley, California (1970).

[21] Samuelson, P. A., "Abstract of a Theorem Concerning Substitutability in Open Leontieff Systems," in Activity Analysis of Production and Allocation, T. C. Koopmans (ed.) (Wiley, New York 1951).

[22] Skeith, R. W., "A Multilocation Inventory Model," Journal of Industrial Engineering, 29, 630-633 (1968).

[23] Walkup, R. W., and R. J. B. Wets, "Stochastic Programs with Recourse," SIAM Journal on Applied Mathematics, 15, 1299-1314 (1967).

[24] Wets, R. J. B., "Programming Under Uncertainty: The Equivalent Convex Program," SIAM Journal on Applied Mathematics, 14, 89-105 (1966).

[25] Wets, R. J. B., "Programming Under Uncertainty: The Solution Set," SIAM Journal on Applied Mathematics, 14, 1143-1151 (1966).

[26] Williams, A. C., "A Stochastic Transportation Problem," Operations Research, 11, 759-770 (1963).

[27] Williams, A. C., "On Stochastic Linear Programming," SIAM Journal on Applied Mathematics, 13, 927-940 (1965).

[28] Williams, A. C., "Approximation Formulas for Stochastic Linear Programming," SIAM Journal on Applied Mathematics, 14, 668-677 (1966).

[29] Yu, P. L., "Cone Convexity, Cone Extreme Points and Non-Dominated Solutions in Decision Problems with Multiobjectives," Journal of Optimization Theory and Applications, 14, 319-377 (1974).



SURVEY  OF  APPROACHES  TO  READINESS 


Zeev  Barzily,  W.H.  Marlow,  and  S.  Zacks 

Program  in  Logistics 
The  George  Washington  University 
Washington,  D.C. 

ABSTRACT 

About thirty references that feature naval logistics environments are considered.
All are unclassified and all appear in the open literature or are available
from the Defense Logistics Studies Information Exchange. Three approaches
are identified (data analysis, theoretical models, and readiness
indexes) and conclusions are presented as to possibilities for answering two
questions: (a) Can the unit do the job? (b) How does readiness depend on
resources? Four cases are treated in detail to illustrate methodology.


INTRODUCTION  AND  SUMMARY 

There has doubtless always been interest in assessing readiness of military units to carry
out particular tasks. Time frames have consisted of immediate instants or extended intervals,
methods have ranged from personal judgments to sophisticated calculations, actual assessments
have varied from merely "yes" or "no" to indexes and complicated probability statements, and
so on, but the primary question has always been: (a) Can the unit do the job? And given the
assessment, the question for logistics has been: (b) How does readiness depend on resources?
For example, in Part I of the Logistics Research Conference proceedings [21] each of the four
senior service representatives referred to these questions in addressing major issues and
problems in logistics.




In  the  present  paper  we  provide  a brief  survey  of  several  approaches  to  answers  for  (a)  or 
(b).  We  have  found  that  we  can,  without  disadvantage,  restrict  our  attention  to  unclassified 
research  reports  that  feature  naval  logistics  environments  and  appear  in  the  open  literature  or 
are  available  from  the  Defense  Logistics  Studies  Information  Exchange  (DLSIE).  We  start  in 
Section  1 with  a concise  general  review  of  the  contribution  and  status  of  about  30  references; 
we  divide  these  references  into  three  convenient  classifications: 

(1) Data Analysis,

(2)  Theoretical  Models, 

(3)  Readiness  Indexes. 

In  Section  2 we  present  a more  detailed  discussion  of  four  important  cases  which  are  reviewed 
in  Section  1.  Our  general  conclusions  are  the  following. 




First,  by  far  the  most  promising  approach  to  obtaining  practical  answers  to  questions  (a) 
and  (b)  appears  to  be  represented  by  the  methodology  study  of  the  Institute  of  Naval  Studies 
[20]  conducted  for  the  Navy  Readiness  Analysis  System;  it  could  be  extended  by  inclusion  of 
further  cluster  analysis  techniques  such  as  those  reported  by  Solomon  [28]  and  pattern  recogni- 
tion procedures.  In  specific  cases,  the  generation  of  special  data—  as  typified  by  the  Polaris 
Military  Essentiality  System  [4]  — or  the  straightforward  use  of  existing  authoritative  data  — 
as  in  the  Logistics  Research  Project  report  [9]  — seems  worthwhile. 

Second, theoretical models, such as those represented by Gaver and Mazumdar [11], Kaplan
[16], and others noted below, should continue to be studied in connection with particular
problems in which readiness can be involved. They are not to be regarded as immediate
sources of operational answers to questions (a) or (b), but their study should produce results
that will be helpful in devising practical procedures.

Third,  we  have  found  no  evidence  to  indicate  that  hierarchical  models  involving  the  calcu- 
lation of  a readiness  index  of  a system  based  on  the  readiness  indexes  of  its  components  (as  in 
METRI  [3]  and  MARIS  [10])  are  promising  for  answering  questions  (a)  or  (b).  In  the  first 
place,  requirements  for  data  (especially  functional  representations)  are  overwhelming.  In  the 
second,  it  is  doubtful  that  the  hierarchies  could  be  used  as  hoped  for,  even  if  data  were  avail- 
able. 


In  terms  of  what  we  have  been  able  to  survey,  it  appears  that  there  is  no  body  of  data  that 
can  be  used,  together  with  existing  methodology,  to  answer  question  (a). 

1.  GENERAL  REVIEW 

There  are  a number  of  ways  in  which  we  can  classify  different  approaches  that  have  been 
taken  to  readiness.  For  example,  some  efforts  fall  under  the  heading  operational  readiness , 
where  attention  is  mainly  focused  on  operations  that  the  military  unit  is  required  to  perform. 
Such  efforts  often  seek  results  somewhat  like  sufficient  conditions  in  mathematics:  given  certain 
evidence,  say  from  training  exercises,  a result  might  be  a prediction  that  the  unit  will  be  able  to 
accomplish  a particular  operation.  Other  efforts  fall  under  the  heading  material  readiness,  where 
attention  is  mainly  focused  on  physical  objects.  Here  results  are  often  sought  that  are  some- 
what like  necessary  conditions  in  mathematics:  given  certain  evidence,  say  from  inspections,  a 
result  might  be  a prediction  that  the  unit  definitely  cannot  accomplish  a particular  operation. 
Other  terms  appear  in  the  literature,  for  example,  combat  readiness  and  industrial  readiness, 
but  instead  of  classifying  approaches  in  this  way  we  use  (1),  (2),  and  (3)  displayed  in  the 
preceding  section. 

1.1  Data  Analysis 

All  approaches  to  readiness  that  we  consider  involve  some  kind  of  analysis  of  data  but 
here  under  classification  (1)  we  collect  those  that  depend  almost  entirely  thereon.  Here  the 
data  evidently  have  logical  connections  with  what  is  needed  to  carry  out  particular  military 
tasks.  The  issues  that  distinguish  different  approaches  are  mainly  two: 

How  pertinent  are  the  data? 

How  defensible  are  the  analyses? 

Let  us  proceed  to  specific  examples. 




Early  efforts  on  military  worth— synonymously,  military  essentiality— at  the  Logistics 
Research  Project  illustrate  the  first  approach  where  special  methods  produce  special  applicable 
data.  These  efforts  were  mainly  directed  at  the  ships  allowance  list  problem,  which  is  the  problem 
of  determining  the  list  of  quantities  of  "repair  parts"  carried  on  board  a ship  in  direct  support  of 
the  installed  "equipments."  Readiness  entered  explicitly  through  coverage  of  question  (a)  by 
questionnaires  for  the  determination  of  specific  consequences  on  the  ship’s  mission  following 
need  for  the  part,  when  no  spare  was  available.  In  the  most  serious  case  for  question  (b),  the 
task— say  the  patrol  of  a submarine— would  have  to  be  terminated.  The  scheme  for  submarines 
given  by  Denicoff,  Fennell,  and  Solomon  [5]  is  modified  by  Denicoff  et  al.  [4]  for  the  Polaris 
weapons  system  and  Denicoff,  Haber,  and  Varley  [6]  apply  the  method  to  naval  aviation.  Par- 
ticularly, the  Polaris  scheme  of  military  essentiality  classes  has  had  long  use  by  the  Navy,  both  as 
a source  of  descriptors—  highest  worth,  high  worth,  and  so  on  — and  for  providing  "readiness 
data"  inputs  for  procedures  and  models  connected  with  inventory  problems. 

The  Navy  produces  various  kinds  of  status  reports  (for  example,  Force  Status  and  Identity 
Reports,  Ready  Material  Condition  data),  many  of  which  include  readiness  grades  or  "C-ratings" 
such  as 

C-l  Fully  ready, 

C-2  Substantially  ready, 

C-3  Marginally  ready, 

C-4 Not ready.

Grades are assigned by the individuals responsible for the military tasks, or in some cases for
the "equipments," in question. Answers to question (a) are directly given in this fashion and,
in cases where "resources" determine C-ratings, responses are also made to question (b). The
Logistics Research Project report [9] describes a method of analyzing and using such data for a
fleet of destroyers, as follows. For one ship there are C-ratings for eleven subresources covering
personnel, supply, equipment, and (average) training. A single C-rating is deduced for each
ship using a "weakest link approach" and then average C-ratings are obtained for groups of
ships. Measures are also obtained for individual subresources in ways that are responsive to
question (b); specifically, the difficulty of improving readiness (by improving particular
subresources) is addressed and the major problem areas are identified. In summary, we can say
that in [9] the data are by design pertinent and the analyses are intentionally unsophisticated.
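
The aggregation just described can be sketched in a few lines. This is an illustration under stated assumptions rather than the procedure of report [9]: C-ratings are coded as the integers 1 (fully ready) through 4 (not ready), the "weakest link" is taken to be the worst, that is the numerically largest, subresource rating, and the function names and sample ratings are hypothetical.

```python
def ship_rating(subresource_ratings):
    """Weakest-link rating for one ship: the worst (largest) C-rating
    among its subresources, with 1 = fully ready and 4 = not ready."""
    return max(subresource_ratings)


def group_average(ships):
    """Average of the weakest-link ratings over a group of ships."""
    ratings = [ship_rating(r) for r in ships]
    return sum(ratings) / len(ratings)


# Hypothetical ratings for eleven subresources on each of two ships.
ship_a = [1, 1, 2, 1, 3, 1, 1, 2, 1, 1, 1]
ship_b = [1, 1, 1, 1, 1, 2, 1, 1, 1, 1, 1]
print(ship_rating(ship_a), ship_rating(ship_b), group_average([ship_a, ship_b]))
```

Under this coding a single poor subresource determines the rating of the whole ship, which is why measures for individual subresources are needed to see where improvement would help.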

The U. S. Navy Board of Inspection and Survey has long been a source of data on the
material condition of ships and their readiness. Segel [27] describes origins of a uniform
analytical inspection methodology that was in use for a long period and Solomon [28] reports
results from cluster analyses on such data. McVoy [23] is a source of considerable information
on different approaches to readiness based on physical condition. He also furnishes a substantial
list of references. Data on repairs, modifications, and overhauls to ships similarly offer
promise of helpful conclusions on the physical condition of ships; for example, Hamilton [13]
did an early study of effects of personnel, material supply, availability, obsolescence, and
deterioration on readiness.

It  is  our  opinion  that  by  far  the  most  substantial  "data  analysis"  approach  to  readiness  is 
given  in  the  methodology  study  of  the  Institute  of  Naval  Studies  [20]  for  the  Navy  Readiness 
Analysis  System.  It  reports  on  work  during  the  second  half  of  the  1960’s,  when  the  Navy  was 
committed  under  high  priority  to  the  development  of  such  a system  for,  among  other  things, 
determining  how  changes  in  resources  and  environments  can  be  expected  to  affect  the  perfor- 
mance and  capabilities  of  Navy  units,  forces,  and  activities.  In  the  words  of  its  abstract,  it 




describes  a method  developed  for  systematically  examining  the 
relationships  among  personnel,  training,  equipment,  and  supply 
resource  variables  and  destroyer  performance  measures.  Equa- 
tions for  evaluating  performance  readiness  of  Atlantic  Fleet 
destroyers  at  the  end  of  refresher  training  are  presented,  and 
recommendations  are  made  for  improving  performance  meas- 
urement and  resource  data  collection. 

A broad  range  of  statistical  procedures  is  applied  in  [20],  and  in  Section  2 of  the  present  paper 
we  provide  a short  discussion  of  the  methodology.  It  is  important  to  comment  that,  in  contrast 
with  the  approaches  reported  in  Section  1.3,  the  methodology  in  [20]  does  not  try  to  express 
the  readiness  of  a ship  by  one  index  number,  but  instead  provides  a vector  of  readiness  score 
factors  which  are  uncorrelated.  Each  score  factor  provides  additional  information  and  is  thus  an 
important  factor  of  readiness.  We  believe  that  anyone  who  is  interested  in  pursuing  readiness 
analysis should study [20].

1.2 Theoretical Models

In  every  approach  to  readiness  that  we  have  found,  there  is  a model  of  some  kind,  and 
somewhere  there  is  theory,  but  here  we  collect  efforts  designed  to  provide  models  and  analyti- 
cal solutions  to  specific  problems  that  may  be  related  to  the  evaluation  of  readiness.  We  do  not 
include  theoretical  models  on  inventory,  maintenance,  replacement,  reliability,  and  so  on,  even 
though such matters affect important aspects of readiness evaluation. We include only models
directly  motivated  by  the  problem  of  assessing  readiness. 

One of us has surveyed in the Logistics Research Conference proceedings [31] the problem
of measuring and making a statistical inference on operational readiness. The papers that
are surveyed (Gaver and Mazumdar [11], Mazumdar [22], and Zacks [30]) consider the
model of a two-state Markov chain ("up" and "down") and the problems are to estimate the
probability of readiness in particular ways. Another example is Tolins [29], who works with a time
series of readiness grades for an individual ship as a continuous-time Markov chain having
stationary transition probabilities, and then deals with groups of ships.
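
To make the two-state model concrete, the sketch below estimates the transition probabilities of a discrete-time "up"/"down" chain from an observed sequence by simple transition counts and then computes the long-run fraction of time in the "up" state. It is an illustration only and is not claimed to be the estimator used in any of the papers cited above; the function names and the sample sequence are hypothetical.

```python
def estimate_transitions(states):
    """Estimate (a, b) from a sequence of "up"/"down" observations, where
    a = P(up -> down) and b = P(down -> up), using transition counts.
    The sequence must contain at least one transition out of each state."""
    from_up = up_to_down = from_down = down_to_up = 0
    for prev, nxt in zip(states, states[1:]):
        if prev == "up":
            from_up += 1
            up_to_down += nxt == "down"
        else:
            from_down += 1
            down_to_up += nxt == "up"
    return up_to_down / from_up, down_to_up / from_down


def long_run_readiness(a, b):
    """Stationary probability of the "up" state for the two-state chain."""
    return b / (a + b)


history = ["up", "up", "down", "up", "up", "up", "down", "down", "up"]
a_hat, b_hat = estimate_transitions(history)
print(a_hat, b_hat, long_run_readiness(a_hat, b_hat))
```

The stationary probability b / (a + b) follows from balancing the flow between the two states; for a short history such as the one above the estimate is of course very rough.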

Several papers on readiness and related areas were prepared at New York University during
1972-73. Barish and Ehrenfeld [1], and Kaplan [14, 16, 18] are concerned with measurement
of readiness by a production function or utility function, similar to those used in economics.
These functions constitute readiness data, but they must describe level of performance (output)
in terms of available resources (inputs) and, as such, valid ones are difficult to obtain, certainly
as compared with any data that we have considered above. Greenberg [12] presents two
techniques for measuring readiness. Effects of transportation are studied by Kaplan in [15] while in
[17] and [19] he considers replacement problems.

1.3  Readiness  Indexes 

Under  the  present  classification  we  collect  studies  that  depend  in  essential  ways  on  meas- 
urements, say  on  a scale  from  zero  to  unity,  that  can  be  called  readiness  indexes.  In  essence, 
such  indexes  represent  values  of  functional  relationships  of  the  kind  described  above  as  produc- 
tion functions  and,  again,  valid  examples  are  difficult  to  find.  The  main  idea  here  is  to  suppose 
that  there  is  an  index  for  each  part  of  a large  (typically  hierarchical)  complex  system  and  that 
(by  aggregation)  the  readiness  of  the  system  is  determined  by  the  value  of  a single  index  or  by 
at most a few of them. In other words, the present approach involves indexes that might
perform analogously to "gross national product" in economics or "intelligence quotient" in
psychology or education.


SURVEY  OF  APPROACHES  TO  READINESS 




The METRI Project was sponsored by the U. S. Navy during the early 1960s. Its
objective was to develop a system using readiness indexes to measure military essentiality of repair
parts for (destroyer) allowance lists. A ship was represented as a hierarchical structure
proceeding downward through missions, functional subsystems, and components. The actual hierarchy
was  to  be  constructed  using  four  basic  structures,  series,  supplements,  alternates,  and  colla- 
terals, for  which  rules  were  given  so  that  readiness  indexes  for  subsystems  could  be  calculated 
from  those  for  components,  indexes  for  missions  from  those  for  subsystems,  and  so  on.  In  the 
end,  effects  of  changes  in  inventory  levels  for  parts  were  to  be  transmitted  up  the  hierarchy. 
We  review  this  project  in  more  detail  in  Section  2.  Dunlap  and  Associates  [7,  8]  and  Rupp  and 
Kronson  [24]  provide  some  details,  and  Cooper  [3]  presents  afterthoughts. 

Project  MARIS  was  a successor  to  METRI.  It  addressed  the  problems  of  relating  the 
material-support  budget  and  budgetary  changes  to  the  operational  capability  of  the  Polaris 
weapons  system  and  of  assessing  impacts  of  changes  in  the  logistics  support  system  on  the 
operational  capability.  It  was  a very  large  multiechelon  effort  involving  many  data  analyses, 
numerous  theoretical  models,  several  simulations,  and  great  complexity.  We  include  it  here 
because  an  attempt  was  made  to  provide  a single  readiness  index  to  measure  the  performance 
of  a complex  military  system.  Changes  "down  below,"  say  in  repair  parts  support,  were  to  be 
transmitted  "to  the  top"  where  they  were  to  be  read  off  as  changes  in  the  readiness  index. 
Details  are  to  be  found  in  the  General  Electric  Company  manual  [10];  the  methodology  is  dis- 
cussed in  Section  2. 

The  MAXCAP  model  of  Schnelker  [25,  26]  is  intended  for  use  in  preparing  ships’ 
allowance  lists.  It  fits  well  into  approach  (3)  because  it  in  effect  involves  a maximization  of 
ships’  capability  (readiness  index)  subject  to  a stipulated  budget.  It  again  uses  a hierarchical 
model.  But  it  should  be  noted  that  it  was  an  internal  Navy  effort  at  the  Fleet  Material  Support 
Office  that  was  far  smaller  than  the  contract  efforts  in  METRI  and  MARIS. 

A motivating  factor  common  to  all  of  these  efforts  is  the  need  to  measure  the  effects  of 
budgetary  changes  on  the  readiness  of  large-scale  complex  systems.  The  studies  mentioned 
above attempted to index readiness as a function of factors that are influenced by budgetary
constraints. However, the readiness indexes proposed do not attain the desired objective. They
are  usually  very  insensitive  to  changes  that  occur  at  the  lower  echelons  and,  furthermore,  they 
are  generally  improper  indexes  of  readiness. 

2.  METHODOLOGY 

In  this  section  we  discuss  four  studies,  two  from  Section  1.1  and  two  from  1.3. 

2.1  Two  Examples  of  Data  Analysis 

Let us again consider the Logistics Research Project report [9], where every ship is
represented as a collection of eleven subresources (propulsion, navigation, communication,
weapon systems, personnel, and so on). Each subresource is given a grade by the commanding
officer, C-1, C-2, C-3, or C-4, as previously described. The question in [9] is how to analyze
the vectors of eleven grades obtained periodically from each ship to obtain a pattern of
readiness for the individual ships and for the entire fleet. A methodology for such an analysis is
proposed based on conversion of the C-rating grades to numerical scores by assignment of
values 0 to C-1, 1 to C-4, and $p_1$ and $p_2$ ($0 < p_1 < p_2 < 1$) to C-2 and C-3, respectively.
(These numerical scores reflect the state of unreadiness rather than the state of readiness.)
The state of readiness of the whole ship is expressed as the minimal C-grade of the
subresources (the worst readiness rating). The state of unreadiness of the whole fleet is




Z.  BARZILY,  W.  H.  MARLOW  AND  S.  ZACKS 


expressed  as  an  average  of  the  unreadiness  of  the  individual  ships.  This  average  does  not 
reflect  the  extent  of  unreadiness  in  the  sense  of  how  difficult  it  is  to  improve  readiness  (or,  in 
other  words,  how  many  subresources  should  be  improved  before  readiness  is  improved).  For 
the purpose of obtaining this additional information, a fleet measure $T$ is constructed in the
following manner. A subresource of a given ship is called visible if it agrees with the total rating
of the ship. Let $M_{ij}$ be the number of ships in the fleet having a visible $i$th subresource, being
equal to C-$j$ ($i = 1, \ldots, 11$; $j = 1, \ldots, 4$). A total fleet score for the $i$th subresource is
defined then as

$$v_i = M_{i1} \cdot 0 + M_{i2}\, p_1 + M_{i3}\, p_2 + M_{i4} \cdot 1 = p_1 M_{i2} + p_2 M_{i3} + M_{i4}, \qquad i = 1, \ldots, 11.$$

The measure of difficulty for improving the state of unreadiness is $T = \sum_{i=1}^{11} v_i$.
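The fleet measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: grades are encoded as the integers 1 to 4 for C-1 to C-4, the "worst" rating of a ship is taken to be its highest-numbered grade, and the function name, the three-subresource fleet, and the values of p1 and p2 are hypothetical (the report [9] uses eleven subresources and leaves p1 and p2 to the analyst).

```python
from typing import Sequence


def fleet_difficulty(fleet: Sequence[Sequence[int]], p1: float, p2: float) -> float:
    """Return T, the sum over subresources i of the fleet scores v_i.

    Each ship is a sequence of grades, one per subresource, with
    1..4 standing for C-1..C-4 (an encoding assumed for this sketch).
    """
    if not 0.0 < p1 < p2 < 1.0:
        raise ValueError("scores must satisfy 0 < p1 < p2 < 1")
    score = {1: 0.0, 2: p1, 3: p2, 4: 1.0}  # unreadiness score of each grade
    v = [0.0] * len(fleet[0])
    for grades in fleet:
        overall = max(grades)  # ship rating = worst subresource grade
        for i, grade in enumerate(grades):
            if grade == overall:  # subresource i is "visible" on this ship
                v[i] += score[grade]  # adds p_j to v_i, i.e. counts toward M_ij
    return sum(v)


# Hypothetical fleet of three ships with three subresources each.
example_fleet = [[1, 1, 2], [3, 3, 1], [4, 2, 4]]
T = fleet_difficulty(example_fleet, p1=0.3, p2=0.6)
```

Accumulating the score of every visible subresource is equivalent to forming the counts $M_{ij}$ first and then weighting them, so the sketch never needs to store the counts explicitly.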

The  method  discussed  above  is  an  attempt  to  quantify  the  qualitative  C-ratings  of  ships 
and  to  measure  the  state  of  unreadiness  of  the  fleet  by  proper  averages  of  the  indexes  obtained. 
The quantification method depends on arbitrarily assigned $p_1$ and $p_2$ values for the categories
C-2  and  C-3.  In  addition,  the  indexes  are  based  on  the  minimum  rating  value  of  the  eleven 
subresources  of  a ship.  This  measurement  of  unreadiness  may  lose  important  information  con- 
cerning the  type  of  subresources  that  cause  low  readiness  values.  Different  ships  may  be 
classified  as  having  the  same  readiness  level  although  their  readiness  problems  may  be  substan- 
tially different.  There  is  some  doubt  as  to  whether  or  not  the  assignment  of  C-ratings  is  an 
effective  evaluation  method,  and  there  also  are  concerns  for  the  reliability  of  the  grades  pro- 
vided by  the  officers  concerned.  These  questions  deserve  special  study. 

The methodology for the Navy Readiness Analysis System in the Institute of Naval
Studies report [20] gives procedures for expressing the level of readiness of Navy destroyers as
certain functions of the Refresher Training Operational Readiness Inspection (ORI)
scores. The study involves 82 destroyers and is designed to analyze the relationship between
resource variables and performance scores. The ORI scores relate to 29 areas, of which 21 are
related to mission functions, such as antiair warfare, antisubmarine warfare, surface warfare,
command and control communications, mobility, and casualty control. The resource areas
considered are personnel, training, equipment, and supply. Thus, the original performance data
consist of 82 vectors (one for each ship) of 21 components. Each component (an ORI score) is
provided by a team of inspectors. As anticipated, the 21 scores of the various subsections of a
ship are correlated and some subsections are highly correlated. By applying principal component
analysis (see [20] for special details or Cooley and Lohnes [2] as a general reference) the scores
of the 21 subsections are reduced to eight linear combinations, with weights given by the
eigenvectors corresponding to the largest eigenvalues. Factor analysis is then performed and a
rotated three-factor system provides the most interpretable solution. Factor I, named control
procedures, involves six performance variables concerning tactical information (resulting from
the interpretation of radar data). Factor II, named casualty control procedures, involves four
performance variables concerned with procedures of preventing damages and effecting repairs.
Factor III, named antisubmarine warfare tactical communications, is defined by three performance
variables which measure a series of activities with the chain of communications. The
performance of each ship is then expressed by three values corresponding to the three factor scores.
Ships can be clustered into homogeneous groups according to these three factor scores. The
dimensionality of the data has been reduced from 21 correlated variables to three uncorrelated
factors.
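The first step of the reduction described above, forming a linear combination of scores with weights given by the leading eigenvector, can be illustrated with a small pure-Python sketch. It is not the procedure of [20]: it extracts only the first principal component (by power iteration on the sample covariance matrix), it performs no factor rotation, and the five ships with three scores each are made-up numbers used purely for illustration.

```python
import math


def covariance_matrix(rows):
    """Return the sample covariance matrix of the columns and the centered rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centered = [[r[k] - means[k] for k in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    return cov, centered


def leading_component(rows, iterations=500):
    """Largest eigenvalue, its unit eigenvector, and the component scores."""
    cov, centered = covariance_matrix(rows)
    d = len(cov)
    w = [1.0 / math.sqrt(d)] * d  # starting vector for power iteration
    for _ in range(iterations):
        u = [sum(cov[a][b] * w[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in u))
        w = [x / norm for x in u]
    eigenvalue = sum(w[a] * cov[a][b] * w[b] for a in range(d) for b in range(d))
    scores = [sum(c[k] * w[k] for k in range(d)) for c in centered]
    return eigenvalue, w, scores


# Hypothetical inspection scores: 5 ships, 3 correlated score areas.
ori_scores = [[70, 65, 80], [85, 80, 90], [60, 58, 75], [90, 88, 95], [75, 70, 78]]
eigenvalue, weights, scores = leading_component(ori_scores)
```

The sample variance of `scores` equals `eigenvalue`, which is the sense in which the leading component captures the largest share of the variation in the correlated scores.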




An  important  question  is  how  the  four  resources,  personnel,  training,  equipment,  and 
supply,  affect  the  readiness-factor  scores.  For  this  purpose  multiple-regression  analysis  is  per- 
formed for  each  one  of  the  factor  scores  on  the  various  variables  characterizing  the  four 
resource  categories.  This  analysis  shows  the  relative  importance  of  the  various  resources  on 
performance-readiness  factors.  It  can  provide  information  on  possible  interactions  between 
different  resource  categories  (personnel  and  training,  equipment  and  inventory  management, 
and  so  on).  In  addition,  the  regression  analysis  provides  the  means  for  readiness  estimation, 
given the status of the resource variables. For specific details see the Institute of Naval Studies
report [20].

2.2  Two  Examples  Based  on  Hierarchical  Structures 

The methodology of the METRI project as reported by Cooper [3] and by Dunlap and
Associates [7, 8] was to construct a huge hierarchical structure modeling a Navy ship (destroyer)
and to compose a readiness index from readiness values of its elementary units by certain rules.
Readiness indexes are to be constructed for each component (elementary unit) according to the
capability of its parts to function properly throughout the mission period. These indexes are to
be functions of the reliability of the parts (the failure process) and the number of spare
replacement parts available. Let $R_1, \ldots, R_n$ denote the readiness indexes of the components in the $i$th
subsystem ($i = 1, \ldots, k$); then the readiness index of the subsystem is a function

$$R_{S_i} = \phi_i(R_1, \ldots, R_n), \qquad i = 1, \ldots, k.$$

The readiness of the whole system is a function $R_T = f(R_{S_1}, \ldots, R_{S_k})$. The problem
(apparently insolvable) is to determine suitable functions for the composition of the readiness
indexes to serve as an overall index. For this purpose it was assumed that the hierarchical
structure of a ship can be uniquely described as a combination of the following four basic
structures:

(i) Series structure: If $R_1, \ldots, R_n$ are the readiness indexes of $n$ components connected in series, then the
readiness of the structure is

rt-  |n*“' 

where $a_i$ and $A$ are empirical coefficients for the specific items.

(ii)  Supplement  structure:  If  n items  independently  supplement  one  another  (for  example, 
sonar,  surface  radar,  and  air  radar  for  detection  of  enemies)  then 


The parameters $K$, $a_i$ ($i = 1, \ldots, n$) provide for the relative importance of the items.

(iii) Alternative structure: If a main system whose readiness is $R_1$ has a standby unit
(readiness $R_2$) to replace it in case it fails, then

$$R_T = R_1 + (1 - R_1) K R_2.$$

The  parameter  K expresses  the  relative  ability  of  the  standby  unit  to  replace  the  main  one. 




(iv) Collateral structure: Let $R_1$ denote the readiness of a unit that is essential to the
operation of a ship. Assume that there is a collateral element that affects the system's
readiness in the presence of the essential unit. For example, the collateral element might provide
for the maintenance of the essential unit. Let $R_2$ denote the readiness index of the collateral
unit. The readiness of this structure is given by

$$R_T = R_1 [K + (1 - K) R_2].$$


In  summary,  it  was  supposed  that  by  applying  the  rules  for  calculating  the  readiness  of  the  basic 
structures  one  can  calculate  the  readiness  of  a ship. 
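To make the idea of composing indexes up a hierarchy concrete, the sketch below codes two of the four rules and nests them. It rests on assumptions: the alternative rule is taken to be $R_1 + (1 - R_1) K R_2$, which is a reading of a formula that is damaged in the available text, the collateral rule is $R_1[K + (1 - K)R_2]$, and the function names and all numerical values are hypothetical. The series and supplement rules are omitted because their formulas cannot be recovered here.

```python
def alternative(r_main: float, r_standby: float, k: float) -> float:
    """Readiness of a main unit backed by a standby unit (assumed form)."""
    return r_main + (1.0 - r_main) * k * r_standby


def collateral(r_essential: float, r_collateral: float, k: float) -> float:
    """Readiness of an essential unit affected by a collateral element."""
    return r_essential * (k + (1.0 - k) * r_collateral)


# Hypothetical two-level hierarchy: a detection function made of a main
# radar with a standby set, whose upkeep depends on a maintenance element.
detection = alternative(r_main=0.8, r_standby=0.5, k=1.0)
subsystem = collateral(r_essential=detection, r_collateral=0.6, k=0.3)
```

Both rules return a value in [0, 1] whenever their inputs and K lie in [0, 1], which is what allows the output of one structure to serve as the input of the next level up.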


A sensitivity analysis was proposed to show the rate of change in the overall readiness as a
function of changes in the number of spare parts assigned. Such an analysis was designed to
answer the question of the effect of changes in the inventory levels on the readiness of a ship.
The proposed analysis is, however, vague in publications on METRI. The whole approach
appears to have been found to be theoretically invalid and practically intractable.

As mentioned earlier, the main objective of the General Electric Company project MARIS
[10] was to relate the system of material support to the operational capability of the Polaris
weapons  system.  The  readiness  index  was  the  expected  proportion  of  operational  missiles  in  a 
specified  period  of  time. 


The  system  considered  is  a three-echelon  support  system  that  contains  four  squadrons  of 
submarines  (first  echelon)  with  one  squadron  assigned  to  each  of  four  tenders  (second 
echelon).  The  tenders  reorder  from  stock  points  which  procure  material  from  outside  sources. 
The  stock  points,  the  inventory  control  points,  the  repair  facilities,  and  industrial  sources  con- 
stitute the  third  echelon.  Superimposed  on  the  above  three-echelon  structure  is  a transporta- 
tion system  for  moving  material  among  the  various  system  elements.  Routine  replenishment  is 
provided  by  four  cargo  ships,  one  assigned  to  each  of  the  tenders,  while  occasional  high-priority 
transportation  is  also  available. 


The  basic  MARIS  procedure  aims  to  relate  the  budget  for  replenishment  to  readiness  of 
submarines  via  a budget  model,  a three-echelon  simulation  model,  and  a submarine-readiness 
model. 

The  budget  model  simulates  the  estimated  procurement  expenditure  for  parts.  The 
three-echelon  simulation  model  provides  a detailed  representation  of  the  Navy  support  system 
and it simulates actions taken and resulting effects of all possible events. The
submarine-readiness model is an analytic model for the evaluation of the readiness of a submarine.

The  submarine-readiness  model  is  the  essential  part  of  the  project.  This  model  deter- 
mines the  readiness  of  a submarine  as  a function  of  the  onboard  inventory  of  spare  parts.  Let 
us  give  a simple  example  to  show  how  the  readiness  is  calculated.  The  spare  parts  treated  are 
related  to  missiles  and  are  replaceable  on  patrol.  Suppose  that  a certain  part  has  at  the  begin- 
ning of  a patrol  n units  in  stock  and  is  installed  in  m different  applications.  For  the  sake  of  sim- 
plicity, it  is  assumed  that  the  part  has  exponentially  distributed  independent  life  times  at  the 
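various applications, with intensity parameters $\lambda_1, \ldots, \lambda_m$. If $t$ denotes the first instant of
stockout, the probability distribution function of $t$ is the gamma $g(t \mid \lambda, n)$, where $\lambda = \sum_{j=1}^{m} \lambda_j$;
that is,

$$g(t \mid \lambda, n) = \frac{\lambda^n}{\Gamma(n)}\, t^{n-1} e^{-\lambda t}, \qquad 0 < t < \infty.$$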




Given that stockout occurred at time $t$, the probability that all $m$ units will still be operating
$y$ units of time after stockout is given by

$$\prod_{j=1}^{m} e^{-\lambda_j y} = \exp(-\lambda y).$$

The readiness index conditional on the time $t$ of stockout was defined as

$$R_T(t) = \begin{cases} \dfrac{t}{T} + \dfrac{1}{T} \displaystyle\int_0^{T-t} \exp(-\lambda y)\, dy, & t < T, \\[2ex] 1, & t \ge T. \end{cases}$$

Notice that the product of $T$ and $R_T(t)$ equals the conditional expected length of life of the
system in a patrol, given that a stockout occurred at time $t$. Finally, the readiness index related to
this part with $n$ units in stock is calculated by randomization, as follows:

$$R_T(n) = \int_0^{\infty} R_T(t)\, g(t \mid \lambda, n)\, dt.$$

Now $R_T(n)$ is a reasonable index of readiness as a function of a particular part. But the
question is, how can one combine these indexes? No satisfactory answer appears to have been
given.
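A numerical sketch of the part-level index may help fix ideas. It assumes the following reading of the model: failures across the $m$ applications form a Poisson stream with total rate equal to the sum of the individual intensities, stockout is the instant of the $n$th failure (so its density is gamma with shape $n$), and the conditional index for a stockout at time $t$ before the end of a patrol of length $T$ is $[t + \int_0^{T-t} e^{-\lambda y}\,dy]/T$, and 1 otherwise. The function names, the failure rates, and the patrol length are hypothetical, and the integral is approximated by a simple midpoint rule.

```python
import math


def gamma_density(t: float, lam: float, n: int) -> float:
    """Density of the time of the n-th failure in a Poisson stream of rate lam."""
    return lam ** n * t ** (n - 1) * math.exp(-lam * t) / math.factorial(n - 1)


def conditional_readiness(t: float, lam: float, patrol: float) -> float:
    """Readiness index given that stockout occurs at time t."""
    if t >= patrol:
        return 1.0
    # Integral of exp(-lam * y) over [0, patrol - t], in closed form.
    survive = (1.0 - math.exp(-lam * (patrol - t))) / lam
    return (t + survive) / patrol


def part_readiness(n: int, rates, patrol: float, steps: int = 20000) -> float:
    """Readiness index for a part with n >= 1 spares, averaged over stockout time."""
    if n < 1:
        raise ValueError("this sketch requires at least one unit in stock")
    lam = sum(rates)
    h = patrol / steps
    # Midpoint rule over stockout times that fall inside the patrol.
    inside = h * sum(
        conditional_readiness((k + 0.5) * h, lam, patrol)
        * gamma_density((k + 0.5) * h, lam, n)
        for k in range(steps)
    )
    # Stockout after the patrol ends: fewer than n failures during the patrol.
    tail = sum(math.exp(-lam * patrol) * (lam * patrol) ** i / math.factorial(i)
               for i in range(n))
    return inside + tail


# Hypothetical part installed in three applications, 60-day patrol.
rates = [0.01, 0.02, 0.005]
readiness_by_stock = [part_readiness(n, rates, 60.0) for n in range(1, 5)]
```

Under these assumptions the index rises with the number of spares carried, which is the kind of sensitivity to onboard inventory that the model was meant to expose.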

REFERENCES 

[1] Barish, N.N., and S. Ehrenfeld, "Utilities Estimated from Actual Decisions in Readiness Measurement," Technical Report No. 14, Graduate School of Public Administration, New York University, New York, N.Y. (1975) (LD 35213A).*

[2] Cooley, W.W., and P.R. Lohnes, Multivariate Data Analysis (Wiley, New York, 1971).

[3] Cooper, G., "METRI and the Allowance List Problem," Management Science 13, B293-326 (1967).

[4] Denicoff, M., J. Fennell, S.E. Haber, W.H. Marlow, F.W. Segel, and H. Solomon, "The Polaris Military Essentiality System," Naval Research Logistics Quarterly 11, 235-257 (1964).

[5] Denicoff, M., J.P. Fennell, and H. Solomon, "Summary of a Method for Determining the Military Worth of Spare Parts," Naval Research Logistics Quarterly 7, 221-234 (1960).

[6] Denicoff, M., S.E. Haber, and T.C. Varley, "Military Essentiality of Naval Aviation Repair Parts," Management Science 13, B439-453 (1967).

[7] Dunlap and Associates, "METRI Pilot Program Report, USS Ellison (DD-864)" (Darien, Connecticut, 1964) (LD 08174A).

[8] Dunlap and Associates, "METRI Personnel Readiness Measurement" (Darien, Connecticut, 1965) (LD 08175B).

[9] Frank, S.A., W.B. Gruttke, W.H. Marlow, and S.J. Mathis, "Readiness Measurements via Subresource C-Ratings," Technical Paper Serial T-216, Logistics Research Project, The George Washington University, Washington, D.C. (1968) (LD 19644).

[10] General Electric Company, "MARIS Technical Manual," (TEMPO, Santa Barbara, California, 1969) (LD 32578 MA).

[11] Gaver, D.P., and M. Mazumdar, "Statistical Estimation in a Problem of System Reliability," Naval Research Logistics Quarterly 14, 473-488 (1967).


*LD numbers identify documents that can be obtained from the Defense Logistics Studies Information Exchange, U. S.
Army Logistics Management Center, Fort Lee, VA 23801.




[12] Greenberg, I., "On the Determination of the 'Closeness' to Complete Readiness and of Dynamic Readiness," Technical Report No. 2, Graduate School of Public Administration, New York University, New York, N.Y. (1972) (LD 28693A).

[13] Hamilton, J.E., "Ship Material Readiness," Technical Paper Serial T-145, Logistics Research Project, The George Washington University, Washington, D.C. (1962) (LD 05636).

[14] Kaplan, S., "A Production Function Approach to the Measurement of Short Term Readiness of Navy Units," Technical Report No. 1, Graduate School of Public Administration, New York University, New York, N.Y. (1972) (LD 28693).

[15] Kaplan, S., "Readiness and the Optimal Redeployment of Resources," Technical Report No. 4, Graduate School of Public Administration, New York University, New York, N.Y. (1972) (LD 28829).

[16] Kaplan, S., "Application of Programs with Maximum Objective Functions to Problems of Optimal Resource Allocation," Operations Research 22, 802-807 (1974).

[17] Kaplan, S., "A Note on a Constrained Replacement Model for Ships Subject to Degradation in Utility," Naval Research Logistics Quarterly 21, 563-568 (1974) (Cf. LD 32736A).

[18] Kaplan, S., "An Approach to the Measurement of the Short Term Readiness of Military Systems," Technical Report No. 10, Graduate School of Public Administration, New York University, New York, N.Y. (1975) (LD 34461A).

[19] Kaplan, S., "Readiness Analysis for Ships Where both Modernization and Replacement are Used to Increase Fleet Readiness," Technical Report No. 13, Graduate School of Public Administration, New York University, New York, N.Y. (1975) (LD 35512A).

[20] Lockman, R.F., P.H. Stoloff, B.H. Manheimer, J.R. Hardgrave, and W.F. Story, "Navy Readiness Analysis System Methodology Study, Vol. 1," Institute of Naval Studies of the Center of Naval Analyses Study No. 27, University of Rochester, Rochester, New York (1970) (Cf. LD 25118 and LD 25118A).

[21] Marlow, W.H., ed., Modern Trends in Logistics Research (Massachusetts Institute of Technology Press, Cambridge, Massachusetts, 1976).

[22] Mazumdar, M., "Uniformly Minimum Variance Unbiased Estimates of Operational Readiness and Reliability in a Two-State System," Naval Research Logistics Quarterly 16, 199-206 (1969).

[23] McVoy, J.L., "The Concept of Materiel Readiness as Applied to Naval Vessels," Logistics Management Institute, Washington, D.C. (1970) (LD 26719).

[24] Rupp, A.E., Jr., and E.T. Kronson, "Final Report on METRI Data Processing Program," Vitro Laboratories, Silver Spring, Maryland (1964) (LD 08243).

[25] Schnelker, H.J., "Max-Cap Allowance Model," Alrand Report 48, Application Development Division, Data Systems Support Office, Mechanicsburg, Pennsylvania (1965) (LD 08595).

[26] Schnelker, H.J., "Max-Cap Allowance Model (Revised)," Alrand Report 48A, Operations Analysis Department, U. S. Navy Fleet Material Support Office, Mechanicsburg, Pennsylvania (1966) (LD 08595A).

[27] Segel, F.W., "The Methodology for Improved Inspection Procedures by the U. S. Navy Board of Inspection and Survey," Technical Paper Serial T-160, Logistics Research Project, The George Washington University, Washington, D.C. (1964) (LD 07860).

[28] Solomon, Henry, "A Note on a First Application of Clustering Procedures to Fleet Material Condition Measurements," Naval Research Logistics Quarterly 18, 415-421 (1971).




[29] Tolins, I.S., "A Continuous Time Markov Process Model of Naval Operational Readiness," Technical Paper Serial T-214, Logistics Research Project, The George Washington University, Washington, D.C. (1968) (LD 19645).

[30] Zacks, S., "Bayes Adaptive Estimation of Current Operational Readiness Parameters," Technical Memorandum Serial TM-61018, Program in Logistics, The George Washington University, Washington, D.C. (1969) (LD 39954A).

[31] Zacks, S., "Review of Statistical Problems and Methods in Logistics Research," in Modern Trends in Logistics Research, W.H. Marlow, ed. (Massachusetts Institute of Technology Press, Cambridge, Massachusetts, 1976), pp. 227-247.


ADAPTIVE  DISPOSAL  MODELS* 


C.  Derman 

Columbia  University 
New York, N.Y.

G.  J.  Lieberman 

Stanford  University 
Stanford,  California 

S.  M.  Ross 

University  of  California,  Berkeley 
Berkeley,  California 

ABSTRACT 

This paper reconsiders the classical model for selling an asset in which offers
come in daily and a decision must then be made as to whether or not to sell.
For each day the item remains unsold a continuation (or maintenance) cost $c$ is
incurred.  The  successive  offers  are  assumed  to  be  independent  and  identically 
distributed  random  variables  having  an  unknown  distribution  F.  The  model  is 
considered  both  in  the  case  where  once  an  offer  is  rejected  it  may  not  be  recalled 
at  a later  time  and  in  the  case  where  such  recall  of  previous  offers  is  allowed. 


1.  INTRODUCTION 

This  paper  reconsiders  the  classical  model  for  selling  an  asset  in  which  offers  come  in 
daily  and  a decision  must  then  be  made  as  to  whether  or  not  to  sell.  For  each  day  the  item 
remains unsold a continuation (or maintenance) cost $c$ is incurred. The successive offers are
assumed to be independent and identically distributed random variables having an unknown
distribution $F$. The model is considered both in the case where once an offer is rejected it may
not  be  recalled  at  a later  time  and  in  the  case  where  such  recall  of  previous  offers  is  allowed. 

In Section 2 we show how bounds on the optimal policy may be obtained when some
partial information about $F$ is available. In particular, we show that if $F$, the distribution of offers,
satisfies the NWUE (new worse than used in expectation) property defined as

$$E_F[X - a \mid X > a] \ge E_F[X] \quad \text{for all } a \ge 0,$$

then the optimal policy has a monotonic relationship with the optimal policy in the case where
the distribution of offers is exponential with the same mean as $F$.

*This research was supported in part by the Office of Naval Research under contracts N00014-75-C-0620 with Columbia
University, N00014-75-C-0561 with Stanford University, and N00014-6I-A-0200-1036 with the University of California,
and by the U.S. Army Research Office-Durham under contract DAHC04-74-C-00226 with the University of California.




In Sections 3 and 4 we consider a Bayesian version of this model by supposing that $F$ is
known to be one of the distributions $F_1, F_2, \ldots, F_n$ with given initial prior probabilities. In
Section 3 we do not allow, and in Section 4 we do allow, the recall of old offers. In both cases
we provide bounds on the optimal policy in terms of the optimal policies in the case where it is
known which of the $F_i$ is equal to $F$. This Bayesian format has previously been considered in
[3], which assumed that $F$ was a normal distribution with known variance and imposed a
normal prior distribution on the mean of $F$. As our model imposes no parametric condition on
$F$ in the prior distribution, the type of results we obtain are somewhat different from those in
[3].

2.  INDEPENDENT  AND  IDENTICALLY  DISTRIBUTED  OFFERS  FROM  AN 
UNKNOWN  DISTRIBUTION  WITH  PARTIAL  INFORMATION 

If the successive offers were independent and identically distributed random variables
having known distribution $F$, then it is well known [2] that the policy that maximizes the total
expected return, both with and without recall, is to accept an offer $x$ if and only if $x > x_F$,
where $x_F$ is the smallest value such that

$$x_F \ge \left[ \int_{x_F}^{\infty} x\, dF(x) - c \right] \Big/ \left[ 1 - F(x_F) \right].$$

If $F$ is continuous, this reduces to

$$c = \int_{x_F}^{\infty} (x - x_F)\, dF(x).$$

The optimal expected return is $x_F + c$.
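The critical number can be computed numerically once the expected excess $E[(X - x)^+]$ of the offer distribution is available. The following sketch, offered as an illustration rather than as the authors' method, finds the smallest $x$ with $E[(X - x)^+] \le c$ by bisection and checks it on an exponential offer distribution, for which $E[(X - x)^+] = \mu e^{-x/\mu}$ for $x \ge 0$ and hence the critical number is $-\mu \log(c/\mu)$. The function names, the search bounds, and the values of the mean and of $c$ are assumptions made for the example.

```python
import math


def expected_excess_exponential(x: float, mean: float) -> float:
    """E[(X - x)^+] for an exponential offer X with the given mean."""
    if x <= 0.0:
        return mean - x  # every offer exceeds a non-positive threshold
    return mean * math.exp(-x / mean)


def critical_number(expected_excess, c: float,
                    lo: float = -1e3, hi: float = 1e3,
                    iterations: int = 200) -> float:
    """Smallest x in [lo, hi] with expected_excess(x) <= c, by bisection.

    Relies on expected_excess being continuous and nonincreasing, with
    expected_excess(lo) > c >= expected_excess(hi).
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expected_excess(mid) <= c:
            hi = mid
        else:
            lo = mid
    return hi


# Hypothetical example: exponential offers with mean 2 and daily cost 0.5.
mean, cost = 2.0, 0.5
x_crit = critical_number(lambda x: expected_excess_exponential(x, mean), cost)
optimal_return = x_crit + cost
```

Because the expected excess is nonincreasing in $x$, bisection is enough; no derivative of the distribution is needed, so the same routine applies to any distribution for which the expected excess can be evaluated.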

We shall start out by comparing the optimal critical number for two different distributions.
To begin we need the following definition:

DEFINITION: For any two probability distributions $F$ and $G$ we say that $F \le_v G$ if

$$\int f(x)\, dF(x) \le \int f(x)\, dG(x)$$

for all increasing convex functions $f$.

If $F$ and $G$ have the same means, then $F \le_v G$ intuitively means that $F$ has less variability
than $G$.
PROPOSITION 1: If $F \le_v G$, then $x_F \le x_G$.

PROOF: $x_G$ is the smallest value satisfying

$$c \ge E_G[(X - x_G)^+].$$

Now

$$E_G[(X - x_G)^+] \ge E_F[(X - x_G)^+],$$

since $f(x) = (x - x_G)^+$ is an increasing convex function. Hence,

$$c \ge E_F[(X - x_G)^+],$$

implying that $x_F \le x_G$.

Proposition  2 is  concerned  with  the  return  from  a nonoptimal  policy: 



PROPOSITION 2: If $x \le x_F$, then the policy that accepts the first offer that is at least as
large as $x$ has a return that is at least $x + c$.

PROOF: To prove the above, consider the expected difference between the optimal
policy that uses the critical number $x_F$ and the above policy that uses the critical number $x$. By
conditioning on whether an offer between $x$ and $x_F$ occurs before or after an offer greater than
$x_F$, we see that the expected difference is at most $x_F - x$ in the former case (since the expected
return from the optimal policy starting at the time of this offer between $x$ and $x_F$ is equal to
$x_F + c - c = x_F$) and it is 0 in the latter case. Hence, the result follows.

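DEFINITION: We say that the distribution $F$, with $F(0-) = 0$, is NWUE if

$$\frac{\int_a^{\infty} \bar{F}(x)\, dx}{\bar{F}(a)} \ge \int_0^{\infty} \bar{F}(x)\, dx \quad \text{for all } a \ge 0,$$

where $\bar{F}(x) = 1 - F(x)$. (If $X$ is a random variable having distribution $F$, then the above is
equivalent to $E[X - a \mid X > a] \ge E[X]$.)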

PROPOSITION 3: If $F$ is NWUE with mean $\mu$, then

$$E(\mu) \le_v F,$$

where $E(\mu)$ is an exponential distribution with mean $\mu$.

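PROOF: It is easy to show that $F \ge_v G$ is equivalent to

$$\int_a^{\infty} \bar{F}(x)\, dx \ge \int_a^{\infty} \bar{G}(x)\, dx \quad \text{for all } a.$$

Thus, we have to show that

$$\int_a^{\infty} \bar{F}(x)\, dx \ge \mu e^{-a/\mu}$$

whenever $F$ is NWUE. By the definition of NWUE we have

$$\int_t^{\infty} \bar{F}(x)\, dx \ge \mu \bar{F}(t)$$

or, equivalently,

$$\bar{F}(t) \Big/ \int_t^{\infty} \bar{F}(x)\, dx \le 1/\mu.$$

The left side is the derivative with respect to $t$ of $-\log \int_t^{\infty} \bar{F}(x)\, dx$, and $\int_0^{\infty} \bar{F}(x)\, dx = \mu$, so
integrating both sides of the above from 0 to $a$ completes the proof.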

We  are  now  ready  for  the  main  theorem  of  this  section. 

THEOREM 1: If the unknown distribution $F$ is known to be NWUE and to have mean $\mu$,
then

$$x_F \ge \bar{x}$$

and the policy which accepts the first offer of at least $\bar{x}$ has return of at least $\bar{x} + c$, where

$$\bar{x} = -\mu \log(c/\mu).$$

PROOF: The result follows immediately from Propositions 1, 2, and 3, since $\bar{x} = x_E$
when $E$ is an exponential distribution with mean $\mu$.
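The bound in Theorem 1 can be checked numerically on a simple NWUE example. The sketch below uses a two-component mixture of exponentials (such mixtures have decreasing failure rate and are therefore NWUE) and compares its critical number with the exponential bound $-\mu \log(c/\mu)$. The return of a threshold policy is computed from the formula $E[X \mid X \ge x] - c\,F(x)/\bar{F}(x)$, which is derived here under the assumption that the cost $c$ is paid once for each rejected offer; that formula, the function names, and the mixture parameters are illustrative assumptions and do not come from the paper.

```python
import math

WEIGHTS = (0.5, 0.5)  # hypothetical mixing probabilities
MEANS = (1.0, 3.0)    # hypothetical component means
COST = 0.5            # hypothetical daily continuation cost


def tail(x: float) -> float:
    """P(X >= x) for the exponential mixture, x >= 0."""
    return sum(w * math.exp(-x / m) for w, m in zip(WEIGHTS, MEANS))


def expected_excess(x: float) -> float:
    """E[(X - x)^+] for the exponential mixture, x >= 0."""
    return sum(w * m * math.exp(-x / m) for w, m in zip(WEIGHTS, MEANS))


def threshold_return(x: float, c: float) -> float:
    """Expected return of accepting the first offer of at least x."""
    accepted_mean = x * tail(x) + expected_excess(x)  # E[X; X >= x]
    return (accepted_mean - c * (1.0 - tail(x))) / tail(x)


def critical_number(c: float, hi: float = 1e4, iterations: int = 200) -> float:
    """Smallest x >= 0 with expected_excess(x) <= c (requires c below the mean)."""
    lo = 0.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if expected_excess(mid) <= c:
            hi = mid
        else:
            lo = mid
    return hi


mu = sum(w * m for w, m in zip(WEIGHTS, MEANS))  # overall mean of the offers
x_bar = -mu * math.log(COST / mu)                # exponential bound of Theorem 1
x_f = critical_number(COST)                      # critical number of the mixture
```

For these parameters the bound is conservative: a seller who knows only the mean and the NWUE property would use a lower threshold than the optimal one, and still secures a return of at least the bound plus $c$.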




REMARK:  One  instance  in  which  the  distribution  of  offers  would  be  NWUE  is  the  case 
in  which  there  are  many  classes  of  potential  customers  and  offers  from  each  class  follow  an 
exponential  distribution.  Thus  the  distribution  of  offers  would  be  a mixture  of  exponential  dis- 
tributions and  the  degenerate  distribution  at  0 (indicating  no  offer),  and  it  would  thus  be 
NWUE (since a mixture of NWUE random variables is also NWUE) [1].

3.  BAYESIAN  MODEL  WITHOUT  RECALL  OF  PAST  OFFERS 

In this section we suppose that if an offer is rejected then it can never be accepted in the
future. In addition, we suppose that, although the distribution F is not known with certainty,
we do know that it is one of the distributions $F_1, F_2, \ldots, F_n$, with given prior probabilities.

We say that the state of the system is (x, P) when x is the present offer under consideration
and $P = (P_1, \ldots, P_n)$ is the posterior probability vector, given all the information that we
have accumulated up to that point (including the present offer x), as to which of the $F_j$ is the
actual distribution.

Also we define V(x, P) to be equal to the expected return from this day onward, given
that the state today is (x, P) and we employ an optimal policy. (If we assume, as we do, that
each of the $F_i$ has a finite variance and c > 0, then it can be shown as in [2] that an optimal
policy exists.)

The optimality equation thus takes the following form:

$$V(x, P) = \max\{x,\ V(P) - c\},$$

where V(P), which represents the best you can do when the distribution is chosen by the prior
probability vector $P = (P_1, \ldots, P_n)$, satisfies

$$V(P) = \sum_j P_j \int V(y, T_y P)\, dF_j(y),$$

where

$$T_y P = [(T_y P)_1, \ldots, (T_y P)_n]$$

and

$$(T_y P)_j = \mathrm{Prob}\{F_j \mid P, y\} = \frac{P_j\, dF_j(y)}{\sum_i P_i\, dF_i(y)}.$$

Furthermore, the optimal policy accepts the offer in state (x, P) if and only if

$$x \ge V(P) - c.$$
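
The posterior update $T_y P$ and the acceptance rule can be illustrated with a small sketch. It is written for distributions with probability mass functions (or densities), so that $dF_j(y)$ is replaced by the likelihood of the observed offer under $F_j$. The function names and the two example distributions are hypothetical, and the continuation value V(P) is taken as an input rather than computed.

```python
def posterior(prior, likelihoods):
    """Bayes update: (T_y P)_j = P_j f_j(y) / sum_i P_i f_i(y).

    `likelihoods[j]` is the probability (or density) of the observed offer
    under distribution j.
    """
    joint = [p * f for p, f in zip(prior, likelihoods)]
    total = sum(joint)
    if total == 0.0:
        raise ValueError("observed offer has zero probability under the prior")
    return [w / total for w in joint]


def accept(offer, continuation_value, cost):
    """Acceptance rule of the optimality equation: accept iff x >= V(P) - c."""
    return offer >= continuation_value - cost


# Hypothetical example: offers take the values 1 or 2.
pmf_1 = {1: 0.8, 2: 0.2}
pmf_2 = {1: 0.3, 2: 0.7}
prior = [0.5, 0.5]
observed = 2
updated = posterior(prior, [pmf_1[observed], pmf_2[observed]])
```

In this example an offer of 2 moves the weight on the second distribution from 1/2 to 7/9.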


PROPOSITION 4: V(P) is a convex function of P.

PROOF: Recall that V(P) represents the best we can do when the distribution is chosen
according to P. Now suppose $P = \lambda P^1 + (1 - \lambda) P^2$ for some $0 < \lambda < 1$, and suppose that
the distribution to be used is to be chosen according to the following two-stage experiment.
First we flip a coin having probability $\lambda$ of coming up heads. If the coin comes up heads, then
we choose the distribution according to the prior probability $P^1$, and if it comes up tails then we
use $P^2$. Now if we are not told the outcome of the coin flip then the problem is exactly the
same as if the distribution was chosen according to P and thus the best we can do is V(P). On
the other hand, if we are to be told about the outcome of the coin flip, then by conditioning on
the outcome we see that our expected return if we play optimally is $\lambda V(P^1) + (1 - \lambda) V(P^2)$.
Hence, as additional information cannot lower our expected return, we see that

$$V(P) \le \lambda V(P^1) + (1 - \lambda) V(P^2).$$




PROOF: Since V(P) is a convex function of P (Proposition 4), the result would follow if
we could show that

$$V(0) \le V(P) \quad \text{for all } 0 \le P \le 1.$$

Now $V(0) = x_{F_1} + c = x_1 + c$. Also, as it is always optimal to reject an offer less than
$\min(x_1, x_2) = x_1$, it follows from the optimality equation that

$$V(P) - c \ge x_1 \quad \text{for all } P,$$

which proves the result.

Thus, when n = 2 and $x_1 < x_2$, it is optimal to accept the offer when in state (x, P) if
and only if $x \ge h(P)$, where $h(P) = V(P) - c$ is an increasing convex function of P with
$h(0) = x_1$, $h(1) = x_2$. Furthermore, bounds on h(P) are given by Proposition 5.

REMARK: There does not appear to be an analogue to Theorem 2 when there are more
than 2 possible distributions. For instance, suppose that the distributions $F_1, F_2, \ldots, F_n$ are
stochastically increasing in the sense that $F_i(t)$ is nonincreasing in i for each t. If we define the
probability vector P to be greater than or equal to the probability vector Q, written $P \ge Q$, if

$$\sum_{i=1}^{j} P_i \le \sum_{i=1}^{j} Q_i \quad \text{for each } j = 1, \ldots, n,$$

then we might hope to prove that $V(P) \ge V(Q)$. However, this need not be the case, as is
indicated by the following example. Suppose $F_1$ puts all its weight on the value 0.9, $F_2$ puts all its
weight on the value 1, and $F_3$ is the distribution of a random variable that takes on the value 1
with probability 0.99 and $10^6$ with probability 0.01, and suppose c = 1. Now, P = (0, 0.9,
0.1) $\ge$ Q = (0.9, 0, 0.1), but it turns out that V(P) < V(Q), the reason being that under Q it
only takes a single observation to determine the true $F_i$, whereas this is not so under P.
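
The example in the remark can be checked numerically. The sketch below evaluates V(P) and V(Q) from the optimality equation of this section for these three distributions only. It is not a general solver: the function name is hypothetical, the recursion over the number of offers equal to 1 is truncated at an assumed horizon of 5000 offers, and it relies on the fact that an offer of $10^6$ is always accepted because no larger offer exists.

```python
def value_before_offer(prior_f3, cost=1.0, big=1e6, q=0.01, horizon=5000):
    """V(P) when offers come from F2 (always 1) or F3 (1 w.p. 1-q, `big` w.p. q).

    `prior_f3` is the prior weight on F3.  After k offers equal to 1 the
    posterior weight on F3 is w3 / (w3 + 1 - prior_f3), w3 = prior_f3*(1-q)**k.
    The backward recursion is truncated after `horizon` offers, where an
    offer of 1 is assumed to be accepted.
    """
    v_next = 1.0
    for k in range(horizon, -1, -1):
        w3 = prior_f3 * (1.0 - q) ** k
        p3 = w3 / (w3 + (1.0 - prior_f3))   # posterior weight on F3
        p_big = p3 * q                      # chance that the next offer is `big`
        # Next offer is 1: take it or pay the cost and continue.  Next offer is big: take it.
        v_next = (1.0 - p_big) * max(1.0, v_next - cost) + p_big * big
    return v_next


c = 1.0
v_f1 = 0.9                        # F1 is a point mass at 0.9
v_f3 = value_before_offer(1.0)    # F3 known with certainty
v_p = value_before_offer(0.1)     # P = (0, 0.9, 0.1)
# Under Q = (0.9, 0, 0.1) the first offer reveals the true distribution.
v_q = 0.9 * max(0.9, v_f1 - c) + 0.1 * (0.99 * max(1.0, v_f3 - c) + 0.01 * 1e6)
```

Under Q the searcher pays for continued sampling only when the distribution is $F_3$. Under P many offers of 1 must be observed before $F_3$ can be ruled out, and that cost is what pushes V(P) below V(Q).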

4.  INDEPENDENT  AND  IDENTICALLY  DISTRIBUTED  OFFERS  FROM  AN 
UNKNOWN  DISTRIBUTION  WITH  RECALL  OF  PAST  OFFERS 

In the previous section we assumed that once an offer was rejected by the decision maker
that offer immediately disappeared. In this section, however, we consider the same model
as in Section 3 but with the exception that an offer remains good indefinitely and may be
accepted at any time.

It turns out that, when the distribution of offers is known, the optimal policy in this
case is identical to the one in which recalling past offers is not allowed. That is, the optimal
policy is to accept the first offer that is at least as large as $x_F$, and the expected return under the
optimal policy is $x_F + c$, where $x_F$ is as defined in Section 2.

Consider now the case in which the distribution of offers is one of the distributions
$F_1, \ldots, F_n$, where the $F_i$ is chosen according to some initial probability vector. The state of
the system at any time can be defined by (m, P), where m is the maximum offer that has been
received up to that time and P is the posterior probability vector (given all offers up to that
time, including any just made) of the true distribution. The optimality equation takes the form

$$V(m, P) = \max\Big\{ m,\ \sum_i P_i \Big[ \int_0^m V(m, T_y P)\, dF_i(y) + \int_m^\infty V(y, T_y P)\, dF_i(y) \Big] - c \Big\},$$

where

$$T_y P = [(T_y P)_1, \ldots, (T_y P)_n]$$

and

$$(T_y P)_j = \frac{P_j\, dF_j(y)}{\sum_i P_i\, dF_i(y)}.$$

While it follows from its definition that V(m, P) is an increasing function of m for fixed
P, it is not immediately evident from the optimality equation that, if the offer m is accepted
when in state (m, P), then the offer m' is also accepted when in state (m', P) whenever
m' > m. We now prove this.

PROPOSITION 6: For fixed P, V(m, P) - m is a nonincreasing function of m.

PROOF: Suppose $m_1 < m_2$. Note that the distribution of the sequence of future offers is
the same no matter whether the initial state is $(m_1, P)$ or $(m_2, P)$, since it depends on the
state (m, P) only through P. We can then conclude that if the initial state is $(m_1, P)$ then, by
following throughout the optimal policy for the initial state $(m_2, P)$, our return when we stop is
within $m_2 - m_1$ of what it would have been if the initial state were really $(m_2, P)$. Therefore,

$$V(m_1, P) \ge V(m_2, P) - (m_2 - m_1).$$

COROLLARY 2: If it is optimal to accept $m_1$ when in state $(m_1, P)$, then it is optimal to
accept $m_2$ when in state $(m_2, P)$ whenever $m_2 > m_1$.

PROOF: If $V(m_1, P) = m_1$, then from Proposition 6

$$V(m_2, P) - m_2 \le 0.$$

This implies, from the optimality equation, that

$$V(m_2, P) = m_2.$$


PROPOSITION 7: For fixed m, V(m, P) is a convex function of P.

PROOF: The proof is identical to the proof of Proposition 4 in Section 3.

COROLLARY 3: $V(m, P) \le \sum_i P_i \max(m, x_i)$, where $x_i = x_{F_i}$.

PROOF: If we let $e_i$ be the vector of zeros with a one in the ith place then

$$V(m, e_i) = \begin{cases} m & \text{if } m \ge x_i, \\ x_i & \text{if } m < x_i. \end{cases}$$

Hence, from convexity

$$V(m, P) \le \sum_i P_i V(m, e_i) = \sum_i P_i \max\{m, x_i\}.$$


PROPOSITION 8: If the present state is (m, P) then

(i) if $m \ge \sum_i P_i \max(m, x_i)$ then it is optimal to accept m.

(ii) if

$$m < \sum_j P_j\, \frac{\int_m^\infty y\, dF_j(y) - c}{1 - F_j(m)}$$

then it is optimal to look at another offer.

(iii) if

$$m < \sum_i P_i \Big[ m F_i(m) + \int_m^\infty y\, dF_i(y) \Big] - c$$

then it is optimal to look at another offer.

PROOF: Part (i) follows directly from Corollary 3, while the proofs of parts (ii) and (iii)
are identical to the corresponding results of Proposition 5 in Section 3.

Suppose now that n = 2 and $x_1 < x_2$. In this case we represent the state by (m, P), where
P is the posterior probability that $F_2$ is the true distribution.

THEOREM 3: V(m, P) is increasing in P for fixed m.

PROOF: As in the corresponding proof of the previous section, we need to show that

$$V(m, 0) \le V(m, P).$$

Now,

$$V(m, 0) = \max(m, x_1).$$

However,

$$V(m, P) \ge m,$$

and, as it follows from Part (ii) of Proposition 8 that it is never optimal to accept an offer less
than $x_1$, we have

$$V(m, P) \ge x_1.$$

That is,

$$m \ge x_1 \Rightarrow V(m, P) \ge m,$$
$$m < x_1 \Rightarrow V(m, P) = V(x_1, P) \ge x_1,$$

and the proof is complete.

Hence, when n = 2 and $x_1 < x_2$, it is optimal to accept m when in state (m, P) if and
only if $m \ge m(P)$, where m(P) is an increasing convex function of P with $m(0) = x_1$,
$m(1) = x_2$.

REFERENCES

[1] Barlow, R., and F. Proschan, Statistical Theory of Reliability and Life Testing: Probability
Models (Holt, Rinehart & Winston, New York, 1977).

[2] Chow, Y.S., and H. Robbins, "A Martingale Systems Theorem and Applications," in
Proceedings of the 4th Berkeley Symposium on Mathematical Statistics and Probability, Vol. 1,
pp. 93-104 (1961).

[3] DeGroot, M.H., "Some Problems of Optimal Stopping," Journal of the Royal Statistical
Society, Series B 30, No. 1, 108-122 (1968).




COMPUTATIONAL  RESULTS  WITH  A BRANCH-AND-BOUND 
ALGORITHM  FOR  THE  GENERAL  KNAPSACK  PROBLEM 

R.L.  Bulfin 

Department  of  Systems  and  Industrial  Engineering 
University  of  Arizona 

Tucson, Arizona

R.G. Parker and C.M. Shetty

School  of  Industrial  and  Systems  Engineering 
Georgia  Institute  of  Technology 
Atlanta,  Georgia 

ABSTRACT 

In this paper, a branch-and-bound procedure is presented for treating the
general knapsack problem. The fundamental notion of the procedure involves
a variation of traditional branching strategies as well as the incorporation of
penalties in order to improve bounds. Substantial computational experience
has been obtained, the results of which indicate the feasibility of the
procedure for problems of large size.


INTRODUCTION 

The general knapsack problem considered in this study is

(1)  P:  Maximize  $z = \sum_{j=1}^{n} c_j x_j$

(2)  subject to  $\sum_{j=1}^{n} a_j x_j \le b$,

(3)  and  $0 \le x_j \le u_j$,

(4)  $x_j$ integer.

Without loss of generality, we can assume $c_j$, $a_j$, and b to be integers and, further, that $a_j$ and
b are nonnegative scalars [6]. In addition, we assume that the indices are ordered such that

$$c_1/a_1 \ge c_2/a_2 \ge \cdots \ge c_n/a_n.$$
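
With the indices ordered by the ratios $c_j/a_j$, the linear programming relaxation of P (constraint (4) dropped) can be solved greedily, and at most one variable comes out fractional. The following is a minimal sketch of that standard greedy solution, not the upper-bounded simplex procedure described later in the paper. It assumes strictly positive $c_j$ and $a_j$, and the function name and the small data set are hypothetical.

```python
from fractions import Fraction


def lp_relaxation(c, a, u, b):
    """Greedy optimum of the LP relaxation of problem P.

    Assumes c[j] > 0, a[j] > 0 and c[j]/a[j] nonincreasing in j.
    Returns (objective value, x) using exact rational arithmetic.
    """
    x = [Fraction(0)] * len(c)
    remaining = Fraction(b)
    value = Fraction(0)
    for j in range(len(c)):
        if remaining <= 0:
            break
        # Fill item j up to its bound or until the knapsack is full.
        take = min(Fraction(u[j]), remaining / a[j])
        x[j] = take
        value += c[j] * take
        remaining -= a[j] * take
    return value, x


# Hypothetical instance with ratios 2.5 >= 2.33 >= 1.5.
value, x = lp_relaxation(c=[10, 7, 3], a=[4, 3, 2], u=[2, 3, 5], b=13)
```

Here the relaxation fills the first variable to its bound, leaves the second fractional at 5/3, and has objective value 95/3, so every integer solution of this instance has value at most 31.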


THE  APPROACH 

The general strategy we adopt in this work is the branch-and-bound approach of Dakin [3]
which, itself, is essentially a modification of the landmark algorithm of Land and Doig [7].
Also, the strategy we pursue incorporates the suggestion of Beale and Small [2] to implement
the concept of penalties in order to improve the bounds used by Dakin. The reader will find
the discussion in Sections 4.1-4.3 of Taha [13] and in Geoffrion and Marsten [5] useful in
understanding the terminology and concepts used in this paper.

Node  Selection 

Given a list of subproblems (nodes) with integer variables, we need to select one for
further exploration. We make such a decision by selecting the subproblem which possesses a
nonintegral optimal solution and has the largest upper bound. Here, the upper bound $\bar Z$ for a
problem CP is given by

(5)  $\bar Z = [Z(\overline{CP}) - PT]$,

where $Z(\overline{CP})$ is the optimal objective value of problem $\overline{CP}$, which is the linear program obtained
from problem CP by relaxing the integrality constraints, [k] denotes the largest integer not
exceeding k, and PT is the penalty for imposing integrality. The form of PT has been derived
by Tomlin [14] and can be given in explicit form as follows: Let $x^*$ denote the optimal solution
to the relaxed problem and let $x_p$ be the fractional-valued variable. The coefficients in the
optimal tableau obtained by using the simplex procedure with upper-bounded variables are
given by

$$\bar a_j = \begin{cases} -a_j/a_p & \text{if } x_j^* = u_j, \\ a_j/a_p & \text{otherwise,} \end{cases}$$

$$\bar c_j = \begin{cases} c_j + c_p \bar a_j & \text{if } x_j^* = u_j, \\ -c_j + c_p \bar a_j & \text{otherwise,} \end{cases}$$

and $\bar b = x_p^*$. The penalty PT is then given by

$$PT = \min_{j \ne p} \begin{cases} f_0\, \bar c_j / f_j & \text{if } f_j \le f_0, \\ (1 - f_0)\, \bar c_j / (1 - f_j) & \text{if } f_j > f_0, \end{cases}$$

where $f_j = \bar a_j - n_j$ for some integer $n_j$ and $0 \le f_j < 1$, and $f_0$ is the fractional part of $\bar b$.

Fathoming

Suppose $Z^*$ is the value of the current best solution to the problem (the incumbent
solution). For a node selected with associated problem CP, the node is considered fathomed if its
relaxed linear program $\overline{CP}$ is infeasible, has the upper bound $\bar Z \le Z^*$, or has an integral optimal
solution. In the last instance, if $\bar Z > Z^*$, then a new incumbent has been found.

Branching

Suppose a node and the corresponding problem CP are selected for exploration. If the
fathoming test fails, it is necessary to create two new branches (subproblems). This is done by
choosing a branching variable $x_v$ and suitably changing its lower or upper bound. The selection
process is designed so that the bound computed from (5) is as small as possible for one of the
problems, so that this problem can be eliminated early.




During the procedure, the values of certain variables will be fixed. We will call the
remaining variables free variables. Selection of an index v from among the free variables is
central to the branching process, and its determination can be summarized by the following rule.
Let $v_1$ be the smallest index of the free variables and $v_2$ the largest index of the free variables in
problem CP. Further, let us consider two problems, one of which has the upper bound on
variable $v_1$ decreased by one and the other of which has the lower bound on variable $v_2$
incremented by one. Let $\bar Z_1$ and $\bar Z_2$ be the bounds obtained by (5) for these problems. If $\bar Z_1 > \bar Z_2$,
we let $v = v_2$. On the other hand, if $\bar Z_1 < \bar Z_2$, we let $v = v_1$. Note that if q is the index of the
fractional-valued variable in problem $\overline{CP}$ and if $v_1 = q$, then $v = v_2$. Likewise, if $v_2 = q$ then
$v = v_1$. The process would detect a terminal node on the branch if $v_1 = v_2 = q$.

Once the index v is selected, we create two problems $CP_1$ and $CP_2$ with integer variables.
We will denote the problems obtained by relaxing the integrality constraints by $\overline{CP}_1$ and $\overline{CP}_2$
respectively. If $x_v^*$ is the optimal value of variable $x_v$ in problem $\overline{CP}$, the relaxation of CP, we
obtain problem $CP_1$ by adding the constraint $x_v = x_v^*$ to problem CP, so that $x_v$ is now fixed.
Hence, the solution to problem $\overline{CP}$ continues to be optimal to problem $\overline{CP}_1$. To create the
second subproblem $CP_2$, the following rule is adopted:

(i) If v < q (so that $v = v_1$), add to problem CP the constraint $x_v \le u_v$, where $u_v = x_v^* - 1$.
In this case, $x_v$ is at its current upper bound in problem $\overline{CP}$, and in problem $\overline{CP}_2$
its optimal value will be $x_v^* - 1$.

(ii) If v > q (so that $v = v_2$), add to problem CP the constraint $x_v \ge l_v$, where $l_v = x_v^* + 1$.
In this case, $x_v$ is at its current lower bound in problem $\overline{CP}$, and in problem $\overline{CP}_2$
its optimal value will be $x_v^* + 1$.
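
For orientation, the bounding and fathoming ideas can be sketched in a few lines. The code below is a deliberately simplified depth-first branch and bound and is not the procedure of this paper. It branches on the variables in index order, enumerates the values of each variable from its largest feasible value downward, uses no penalties, and bounds each node by the integer part of the greedy LP relaxation over the remaining variables. It assumes positive integer data with $c_j/a_j$ nonincreasing, and it is meant for small instances only. All names are hypothetical.

```python
def solve_bounded_knapsack(c, a, u, b):
    """Maximize sum c[j]*x[j] s.t. sum a[j]*x[j] <= b, 0 <= x[j] <= u[j], x integer.

    Assumes positive integers with c[j]/a[j] nonincreasing in j.
    Returns (best value, best x).
    """
    n = len(c)
    best_value, best_x = 0, [0] * n
    x = [0] * n

    def upper_bound(j, capacity):
        # Integer part of the LP relaxation over the free variables j..n-1.
        bound = 0
        for k in range(j, n):
            full = min(u[k], capacity // a[k])
            bound += c[k] * full
            capacity -= a[k] * full
            if full < u[k]:
                # Item k is the fractional variable of the relaxation.
                return bound + (c[k] * capacity) // a[k]
        return bound

    def explore(j, capacity, value):
        nonlocal best_value, best_x
        if value > best_value:
            best_value, best_x = value, x[:]       # new incumbent
        if j == n or value + upper_bound(j, capacity) <= best_value:
            return                                  # node is fathomed
        for xj in range(min(u[j], capacity // a[j]), -1, -1):
            x[j] = xj
            explore(j + 1, capacity - a[j] * xj, value + c[j] * xj)
        x[j] = 0

    explore(0, b, 0)
    return best_value, best_x


result = solve_bounded_knapsack([10, 7, 3], [4, 3, 2], [2, 3, 5], 13)
```

On this small hypothetical instance the LP bound at the root is 31 and the search finds an integer solution of that value, so most of the remaining nodes are fathomed by the bound test alone.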

Initial  Solution 

Woolsey and Swanson [15] have given a procedure, referred to as the slippery algorithm,
which yields good approximate solutions to a 0-1 knapsack problem. A slight modification of
this scheme to treat the general problem is self-evident, and it was used to obtain starting
solutions in this study. We point out, however, that the procedure in [15] is based upon the notion
of determining a "greedy" solution (see Magazine, et al. [8]).

COMPUTATIONAL  ASPECTS 

Once the basic notions of the branch-and-bound scheme are in hand, specific
computational savings can be highlighted. For example, suppose $\bar Z_1$ and $\bar Z_2$ are computed for problem
CP in order to select a branching variable v. If $v = v_1$, then $x_v$ is fixed in $CP_1$. To select a
branching variable for $CP_1$, the free variable with the largest index is still $v_2$.
Hence, the bound $\bar Z_2$ is still valid if $v_2$ is larger than the index of the fractional variable in the
solution of $\overline{CP}_1$. A similar statement can be made if $v = v_2$.

The reader may also note that the solution to problem $\overline{CP}$ can be used in solving problem
$\overline{CP}_2$. Recall that CP and $CP_2$ differ only by an upper or a lower bound on one variable.
Suppose $v = v_1$, so that if $u'_v$ is the upper bound on $x_v$ in problem CP, we add the constraint
$x_v \le u'_v - 1$ to get problem $CP_2$. Then the optimal values of variables $x_j$ for j < v are the same
in problems $\overline{CP}$ and $\overline{CP}_2$. For variable $x_v$ the optimal value is $u'_v - 1$ in problem $\overline{CP}_2$. A
similar statement can be made when $v = v_2$.




For testing, the branch-and-bound procedure discussed above was coded in
FORTRAN and numerous problems were run on a Cyber 74 system. The parameters $c_j$ and $a_j$
were randomly generated from the uniform distributions [1, 50], [1, 100], and [1, 999].
Within each of these classes, the values of $u_j$ were randomly generated over [1, 5] and
[1, 10]. The only exception was in the case of problems with n of 1000, 1500, and 2000, in
which a reduced set of experiments was considered. In all problem classes, the knapsack
size b was randomly generated over [a/3, 2a/3], where $a = \sum_{j=1}^{n} a_j u_j$. All computational
experience is reported in Tables 1-3. All times are in c.p.u. seconds and do not include the
time for ordering the ratios $c_j/a_j$. That such times are not included is consistent with other
studies dealing with knapsack problems.
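
A generator for test problems of this kind might look as follows. This is a sketch under assumptions that the text does not settle: the parameters are taken to be uniformly distributed integers, b is drawn as an integer between a/3 and 2a/3 (rounded inward), and the items are sorted by $c_j/a_j$ before being returned. The function name, argument names and seed handling are hypothetical.

```python
import random


def random_instance(n, coeff_max, bound_max, seed=None):
    """Random general knapsack instance in the style of the experiments.

    c[j], a[j] ~ uniform integers on [1, coeff_max], u[j] on [1, bound_max],
    b an integer between a/3 and 2a/3 with a = sum a[j]*u[j].
    Items are returned ordered by nonincreasing c[j]/a[j].
    """
    rng = random.Random(seed)
    items = [
        (rng.randint(1, coeff_max), rng.randint(1, coeff_max), rng.randint(1, bound_max))
        for _ in range(n)
    ]
    items.sort(key=lambda item: item[0] / item[1], reverse=True)
    c = [item[0] for item in items]
    a = [item[1] for item in items]
    u = [item[2] for item in items]
    total = sum(aj * uj for aj, uj in zip(a, u))
    low = -(-total // 3)                 # ceiling of total / 3
    high = max(low, (2 * total) // 3)    # guard for tiny instances
    b = rng.randint(low, high)
    return c, a, u, b


# For example, one problem of the class of Table 1 with n = 500 and bounds in [1, 5].
c, a, u, b = random_instance(500, coeff_max=50, bound_max=5, seed=1)
```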

Five random problems in each classification were run to determine the mean values
in the tables. Certainly, a rigorous analysis regarding problem behavior relative to size and
other parametric variation would not be possible with such sample sizes. Some trends are
suggested, however, by the data generated. For example, it would seem that a more
marked impact on computing effort is realized by increasing the ranges on parameters $c_j$
and $a_j$ than by, say, increasing the problem size within a given range. Note, for example,
that the mean time for solving 1000 variable problems was slightly less than that for 600
variable problems where the latter possessed $c_j$ and $a_j$ values randomly generated over [1,
999], and the former, over [1, 50] (see Tables 1 and 3). In both cases $u_j \le 5$.

An expected result of the experiment also occurs when the bound $u_j$ is increased. In
most cases, the increase fosters a similar increase in computing effort. Of particular
interest here is the number of nodes generated. Although only two upper-bound values of
$u_j$ were considered in the experiment reported in the tables, numerous other problems
were solved with bound values as great as 30, in which case the increase in computing
effort was generally observed to be substantial compared to those cases reported.

FUTURE  RESEARCH 

Diagnostic analysis relative to the composition of "difficult" problems would be an
obvious area for further study. Throughout this research, it was clear that substantial
variation in computational effort could occur relative to changes in problem parameters.
Frequently, such parametric variation was subtle. For example, it would appear from the
results attained in this study that a critical dimension in problem difficulty is the range of
parameters $a_j$ and $c_j$ as contrasted to an increase in problem size n. In addition, it was
found that by judicious choices of differing distributions for $a_j$ and $c_j$ values particularly
troublesome problems could be created. Note that in the experiment reported values of $a_j$
and $c_j$ were always generated from the same distribution.

In conclusion, it may be of interest to note that the thought underlying the branching
strategy suggested in this work is somewhat akin (although independently developed) to
that expressed by Balas and Zemel [1] as well as others in the case of the 0-1 problem. An
interesting element in these works is the suggestion that a "core problem" exists which
involves only a subset of variables whose values are of significance. We have not
attempted to address this notion in the current work but its analysis and/or substantiation
would seem to hold merit. This is especially important in the light of particularly
impressive computational results arising from the current work on 0-1 problems [1, 9, 12, 17].




TABLE 1. Summary of Computational Experience, $a_j, c_j \in [1, 50]$

          |            u_j in [1,5]                 |            u_j in [1,10]
Problem   |  mean     max.    min.   mean*   max.  |  mean     max.    min.   mean*   max.
size, n   |  time     time    time   nodes   nodes |  time     time    time   nodes   nodes
----------+----------------------------------------+----------------------------------------
   500    |  0.510    1.276   0.047   199.4    504 |  0.919    1.732   0.019   302.2    508
   600    |  1.761    2.156   1.193   453.0    627 |  1.217    2.622   0.056   394.8    792
   700    |  0.567    2.634   0.020   139.8    699 |  1.403    2.626   0.026   406.8    701
   800    |  1.758    3.590   0.027   457.4    838 |  2.424    5.097   0.022   543.2   1086
   900    |  3.538    6.156   0.022   764.8   1179 |  0.708    3.703   0.023   175.8    879
  1000    |  4.302    5.214   3.839   964.8   1026 |  1.834    5.408   0.040   387.6   1080
  1500    |  1.464    6.972   0.053   248.6   1243 |  6.994   12.080   0.040  1072.0   1501
  2000    |  6.405   16.917   0.078   777.2   2021 |  2.429   11.730   0.050   326.2   1631

*Average value of the maximum number of nodes ever maintained in storage.

TABLE 2. Summary of Computational Experience, $a_j, c_j \in [1, 100]$

Columns: problem size n; then, for $u_j \in [1, 5]$ and again for $u_j \in [1, 10]$: mean time, max. time,
min. time, mean nodes, max. nodes. Surviving entries, in reading order:

500    1.868  3.744  547  755
600    2.667  498.4  640  2.223  1.944  624
700    0.060  917  627.0
800    9.115  0.096  3.505  5.519  687.0  891
900    2.176  0.032  402.8  6.201  7.792  3.743  962.0  1057


TABLE 3. Summary of Computational Experience, $a_j, c_j \in [1, 999]$

Columns: problem size n; then, for $u_j \in [1, 5]$ and again for $u_j \in [1, 10]$: mean time, max. time,
min. time, mean nodes, max. nodes. Surviving entries, in reading order:

500    7.128  2.129  698.8  9.415  1154  4.838  12.138  1.782  723.6  5.653  715.6  974
700    6.613  3.971  767.4  6.825  3.128  861.6
800    19.922  3.954  973.4  1269  5.813  9.069  931.8  1243  21.421  38.993  5.665  1260.4  1568  12.734  24.678  7.895  1267



BIBLIOGRAPHY 

[1] Balas, E., and E. Zemel, "Solving Large Knapsack Problems," presented at the Joint
National Meeting of ORSA/TIMS, Miami, Florida (November, 1976).

[2] Beale, E.M.L., and R.E. Small, "Mixed Integer Programming by a Branch-and-Bound
Technique," Proc. IFIP Congress, New York, May, 1965, W.A. Kalenich, ed. (Spartan
Books, Washington, D.C., 1965), pp. 450-451.

[3] Dakin, R.J., "A Tree-Search Algorithm for Mixed Integer Programming Problems,"
Computer Journal, 8, 250-255 (1965).

[4] Garfinkel, R.S., and G.L. Nemhauser, Integer Programming (Wiley, New York, 1972).

[5] Geoffrion, A.M., and R.E. Marsten, "Integer Programming: A Framework and
State-of-the-Art Survey," Management Science, 18, 465-491 (1972).

[6] Glover, F., "A Multiphase-Dual Algorithm for the Zero-One Integer Programming
Problem," Operations Research, 13, 879-919 (1965).

[7] Land, A.H., and A.G. Doig, "An Automatic Method for Solving Discrete Programming
Problems," Econometrica, 28, 497-520 (1960).

[8] Magazine, M.J., G.L. Nemhauser, and L.E. Trotter, Jr., "When the Greedy Solution
Solves a Class of Knapsack Problems," Operations Research, 23, 207-217 (1975).

[9] Nauss, R.M., "An Efficient Algorithm for the 0-1 Knapsack Problem," Management
Science, 23, 27-31 (1976).

[10] Salkin, H.M., Integer Programming (Addison-Wesley, Reading, Massachusetts, 1975).

[11] Salkin, H.M., and C.A. de Kluyver, "The Knapsack Problem: A Survey," Naval Research
Logistics Quarterly, 22, 127-144 (1975).

[12] Suhl, U., "An Algorithm and Efficient Data Structures for the Binary Knapsack Problem,"
Arbeitspapier Nr. 20/1976, Fachbereich Wirtschaftswissenschaft, Freie Universität
Berlin (1977).

[13] Taha, H.A., Integer Programming: Theory, Applications and Computations (Academic Press,
New York, 1975).

[14] Tomlin, J.A., "An Improved Branch-and-Bound Method for Integer Programming,"
Operations Research, 19, 1070-1074 (1971).

[15] Woolsey, R.E.D., and H.S. Swanson, Operations Research for Immediate Application: A
Quick and Dirty Manual (Harper and Row, New York, 1975).

[16] Zionts, S., Linear and Integer Programming (Prentice-Hall, Englewood Cliffs, New Jersey,
1973).

[17] Zoltners, A.A., "A Direct Descent Binary Knapsack Algorithm," Working Paper 75-31,
School of Business Administration, University of Massachusetts, Amherst (1975).




M/M/1  QUEUES  WITH  INTERDEPENDENT  ARRIVAL 
AND  SERVICE  PROCESSES 


C.  R.  Mitchell 

U.  S.  Air  Force  Academy 
Colorado  Springs,  Colorado 

A.  S.  Paulson 

Rensselaer  Polytechnic  Institute 
Troy,  New  York 


ABSTRACT 

We  study  via  simulation  an  M/M/1  queueing  system  with  the  assumption 
that  a customer's  service  time  and  the  interarrival  interval  separating  his  arrival 
from  that  of  his  predecessor  are  correlated  random  variables  having  a bivariate 
exponential  distribution.  We  show  that  positive  correlation  reduces  the  mean 
and variance of the total waiting time and that negative correlation has the opposite
effect. By using spectral analysis and a nonparametric test applied to the
sample  power  spectra  associated  with  certain  simulated  waiting  times  we  show 
the  effect  to  be  statistically  significant. 

INTRODUCTION 

In M/M/1 queues it is usually assumed that a customer's service time is independent of
the time interval separating his arrival from that of his predecessor. Designating the difference
in arrival epochs of customers $C_{n-1}$ and $C_n$ by $T_n$ and the service time of customer $C_n$ by $S_n$, we
depart from the usual assumption of independence and take the pair $(T_n, S_n)$ to be a bivariate
exponential random variable (r.v.) with correlation coefficient $\rho$. For $\rho = 0$ we have the simple
queueing system commonly studied.

Correlated queues of this nature have received very little attention in the literature. Bhat
[2] describes five different classes of single-server queues which are closer to real systems than
they would be with assumptions of independence, and he states that more work needs to be
done in these areas. One of the classes is that of systems with interdependent arrival and service
processes, as assumed here for the pair $(T_n, S_n)$.

Conolly [3] gives the waiting-time distribution and its moments for the unusual
single-server system where arrivals are from a Poisson process and the ratio $S_n/T_n$ is constant for all n.
It is also shown that this pattern of server behavior results in a drastic reduction in the mean
and variance of the waiting times as compared with the conventional M/M/1 system. Conolly
refers to this type of system as self-regulating, and other results are given by him and Hadidi
[6,7]. We study herein, via simulation, this type of system where the dependence between the
service time and the interarrival interval is probabilistic. Since this note was first written,
Conolly and Choo [4,5] have obtained a substantial number of results of an analytical nature
regarding these self-regulating systems. Their results make possible an exact analysis of many
aspects of the queueing system we introduce in this note.

2.  A BIVARIATE  EXPONENTIAL  DISTRIBUTION 


There  are  infinitely  many  ways  of  defining  a bivariate  exponential  distribution  with 
exponential  margins;  however,  only  a few  such  distributions  have  received  any  attention  in  the 
literature.  Because  of  its  many  attractive  mathematical  properties,  the  most  commonly  used 
bivariate  exponential  distribution  is  the  so-called  Wicksell-Kibble  distribution  [11,15].  We  drop 
the  subscript  n for  now  and  take  the  density  function  for  the  bivariate  random  variable  (T,  S) 
with  correlation  0 < p < 1 to  be 

(1)  fU.s)  - (1  - p)Kpe~k'~M‘  l0  [2(pA/isf)'/2], 


I 

where  / > 0,  s > 0,  and  /0(z) 
and  order  zero.  The  density  (1) 


(z/2)2k  . 


" *5  (A!)2 

has  mean  vector 


is  the  modified  Bessel  function  of  the  first  kind 


(2) 


Mr  (Xp')-‘ 
Ms]  “ Up')-1 


and  covariance  matrix 

_ M 

a rsj 

_ [ Mf 

PMrMs) 

® i-U. 

<*i  ] 

[PMrMs 

Ml  ]’ 

where  p'  - 1 - p.  The  marginal  distributions  of  T and  S are  exponential  with  means  pT  and 
Ms  respectively.  A generalization  of  (1)  of  Paulson  [14]  admits  of  correlation  values  -0.25  < p 
< 1.  It  is  readily  shown  that  the  pair  (T,  5)  with  density  (1)  has  a two-dimensional  Laplace 
transform  given  by 


(4)  *(«.  v)  - (1  - p)  [Jl  -I-  -^jjl  + 7)  ~ p]  • 

which clearly reduces to the Laplace transform of a pair of independent exponential variates
when $\rho = 0$. The density (1) does not admit of negative correlations, so that for the case
$\rho < 0$ we use herein the generalization of Paulson [14]. The especially simple form of (4)
makes it possible to obtain analytical results which with other choices of bivariate exponential
distributions would be hard to come by.
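As a quick consistency check, and assuming (4) is read as $\phi(u,v) = (1-\rho)[(1+u/\lambda)(1+v/\mu)-\rho]^{-1}$ (the symbol $\phi$ is a label chosen here, since the original symbol is not legible), setting $v = 0$ should recover an exponential margin for T with the mean given in (2), and setting $\rho = 0$ should factor into independent margins:

```latex
\phi(u,0) = \frac{1-\rho}{\left(1+\frac{u}{\lambda}\right)-\rho}
          = \frac{1}{1+\dfrac{u}{\lambda(1-\rho)}},
\qquad
\phi(u,v)\Big|_{\rho=0} = \frac{1}{1+u/\lambda}\cdot\frac{1}{1+v/\mu}.
```

The first expression is the Laplace transform of an exponential variate with mean $(\lambda(1-\rho))^{-1}$, which agrees with the mean vector (2).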


Whereas  there  is  only  one  exponential  distribution  in  one  dimension,  there  are  infinitely 
many "potential" exponential distributions in higher dimensions. However, since we would
naturally require that any bivariate exponential distribution have a rational Laplace transform
(such as (4)), it does not seem possible that any other bivariate exponential distribution would
be  "much  different"  from  (1)  and  hence  the  use  of  (1)  in  the  study  of  self-regulating  queueing 
systems  should  lead  to  generic  results,  i.e.,  system  behavior  for  one  distribution  should  typify 
behavior  for  the  other  "potential"  distributions.  This  is  likely  to  be  true  even  for  the  singular 
Marshall-Olkin  distribution  [14]. 


We have restricted this note to considerations involving only the bivariate exponential
distribution; consideration of other distributions seems patently merited since, as we shall see,
system behavior is so drastically affected by correlation. A particularly useful generalization of
the density (1) obtainable through the preservation (to higher dimensions) scheme of Paulson [14]
has the Laplace transform


(5)  [equation illegible in the source scan; the surviving fragments show exponents $\alpha$ and $\beta$ and a term $-\rho$]

where $\alpha > 0$, $\beta > 0$, $0 < \rho < 1$, and $\lambda$ and $\mu$ are arrival and service
parameters of the distribution. The density g(t,s) corresponding to (5) reduces to (1) when
$\alpha = \beta = 1$. One margin of the density g(t,s) is gamma (Erlang) with shape parameter
$\alpha$, the other margin is gamma with shape parameter $\beta$, and the variates T and S have
correlation $\rho$. This density would be useful in the study of queueing systems other than the
partially correlated M/M/1, for example, partially correlated $M/E_n/1$ or $E_n/M/1$. Here $E_n$
denotes the Erlang (gamma) distribution of order n.

The variates (T, S) given by (1) may be readily simulated. Let $\{X_{ij},\ j = 1, 2, \ldots\}$ be
independent identically distributed (i.i.d.) exponential random variables with means
$\lambda^{-1}$, $i = 1$, and $\mu^{-1}$, $i = 2$. Let N denote a geometric r.v. with density function

(6)  $P[N = i] = \rho^{\,i-1}(1 - \rho), \quad i = 1, 2, \ldots.$

It follows that

(7)  $(T, S) = \left(\sum_{j=1}^{N} X_{1j},\ \sum_{j=1}^{N} X_{2j}\right)$

has the bivariate distribution (1) (see [12]). The simulation proceeds by our simulating N and
then adding the number of i.i.d. exponentials according to (7). Other comments about this
bivariate r.v. are given in Downton [8]. When $\rho < 0$, simulation of exponential variates is
described in [12].
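The geometric-sum construction in (6) and (7) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' program: the function names `sample_pair` and `correlation` and the parameter values used to exercise them are assumptions made here, and only the case $0 \le \rho < 1$ is covered.

```python
import random

def sample_pair(rng, lam, mu, rho):
    """Draw one (T, S) pair as geometric sums of i.i.d. exponentials, as in (6)-(7)."""
    # N is geometric on {1, 2, ...} with P[N = i] = rho**(i - 1) * (1 - rho).
    n = 1
    while rng.random() < rho:
        n += 1
    # T sums N exponentials with mean 1/lam; S sums N exponentials with mean 1/mu.
    t = sum(rng.expovariate(lam) for _ in range(n))
    s = sum(rng.expovariate(mu) for _ in range(n))
    return t, s

def correlation(pairs):
    """Sample correlation coefficient of a list of (t, s) pairs."""
    n = len(pairs)
    mt = sum(t for t, _ in pairs) / n
    ms = sum(s for _, s in pairs) / n
    cov = sum((t - mt) * (s - ms) for t, s in pairs) / n
    vt = sum((t - mt) ** 2 for t, _ in pairs) / n
    vs = sum((s - ms) ** 2 for _, s in pairs) / n
    return cov / (vt * vs) ** 0.5
```

With a large sample, the mean of T should come out close to $(\lambda(1-\rho))^{-1}$ and the sample correlation close to $\rho$, which is a convenient check on any implementation.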

3.  SOME  RESULTS 


We assume that customers from an infinite population arrive at a single-server queue
according to a Poisson process with rate $\lambda$; an unlimited queue is allowed and the service
discipline is first come, first served with service rate $\mu$ and utilization $\nu = \lambda/\mu$.
The sequence of pairs $(T_n, S_n)$ is assumed to be i.i.d., that is, $(T_n, S_n)$ is independent of
$(T_m, S_m)$ for every $m \ne n$ and all n. For customer $C_n$ we assume that the r.v. $(T_n, S_n)$
has the distribution (1) when $0 < \rho < 1$ and Paulson's generalization of (1) when
$-0.25 < \rho < 0$.


The quantity $W_n$ is defined to be the total waiting time, queueing plus service, for
customer $C_n$ and it is clear that a recursive formula for $W_{n+1}$ is

(8)  $W_{n+1} = \begin{cases} W_n - T_{n+1} + S_{n+1}, & \text{if } T_{n+1} < W_n, \\ S_{n+1}, & \text{if } T_{n+1} \ge W_n \end{cases}$

for all n. If f(t,s) is the joint density of $(T_n, S_n)$ given by (1), then the density of
$W_{n+1}$ is readily found to be (Conolly, personal communication, Conolly and Choo [4,5])

(9)  a,+1(w)  - Jo  a„(w)  f(t,s)  dtds  + ^ a„(w  + r - j)  /(/.  j)  dsdt 


with  corresponding  Laplace  transform 


(10) 


«»+i(z) 


/(z)q,(z)  - za„f(z) 
fU)  - z 


where 




(11) 

and 

(12) 


<*|(r) 


J*£_ 


z + up 


m - i - 

z + H 


It can be shown that $\lim_{n\to\infty} \alpha_n(z) = \alpha(z)$ where $\alpha(z)$ satisfies the functional equation

(13)  «(z)  - r? 

(z  + HP  )(z  + ) 

which  becomes  on  repeated  iteration 

(14)  a(z) 


HP 

Z + H 

oo 

n 

y-i 

MP' 

Zj  + H 

z + HP' 

Z + hv' 

Zj  + HP' 

Zj  + H*>' 

where $z_j = f(z_{j-1})$, $j = 1, 2, \ldots$, $z_0 = z$, $\nu' = 1 - \nu$. The Laplace transform of
the steady-state waiting-time distribution is $\alpha(z)/\alpha(0)$. Equations (13) and (14) contain
complete information about the distribution of steady-state waiting time, but retrieval of this
information in other than numerical form is more difficult than appearances would suggest.
However, the mean and variance may be easily obtained by numerically differentiating this
transform as obtained from (14).

In the sequel we take, without loss of generality, $E[T_n] = \lambda = 1$ and
$E[S_n] = 1/\mu = \nu < 1$. For $T_n$ and $S_n$ independent, as is normally assumed, the mean
waiting time per customer, in steady state, is $\nu/(1 - \nu)$ [13]. At the other extreme, for
$\rho = 1$ Conolly [3] gives the mean waiting time for the case $S_n = \nu T_n$ for all n. Our
results are for other values of correlation. Next we use (8) to show simulated results.
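Recursion (8) is equivalent to $W_{n+1} = S_{n+1} + \max(W_n - T_{n+1}, 0)$, which makes it easy to apply to any simulated sequence of interarrival and service times. The sketch below is illustrative only; the function name `waiting_times` is hypothetical, and the empty-system start is an assumption made here (the note states that its own starting conditions were based on pilot runs).

```python
def waiting_times(interarrivals, services):
    """Apply recursion (8): W_{n+1} = S_{n+1} + max(W_n - T_{n+1}, 0)."""
    result = []
    prev = 0.0  # assumption: the system starts empty
    for t, s in zip(interarrivals, services):
        # If the next customer arrives before the previous one leaves (t < prev),
        # the residual wait prev - t is added to the new service time.
        prev = s + max(prev - t, 0.0)
        result.append(prev)
    return result
```

For example, interarrival times 1, 1, 3 paired with service times 2, 2, 1 give waiting times 2, 3, 1: the second customer arrives while the first is still in service, and the third arrives just as the system empties.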

In Fig. 1 we show how nonzero correlation affects the mean waiting time for $\nu = 0.70$.
For zero correlation the expected waiting time in steady state is 2.333, and we see that the
simulated result after 25,000 service completions is in close agreement (2.323). Conolly shows
for his system ($\rho = 1$) that the expected waiting time for this value of $\nu$ is 1.427, and
again the agreement is very good (1.421). From the figure it is clear that each graph tends to
stabilize for increasing n in accordance with the law of large numbers. Thus, for positive
correlation we have a benefit in system performance in that the mean waiting time decreases, and
for negative correlation the process is degraded. (Starting conditions for the simulations were
based on pilot runs.)

Figure 2 shows the ratio of mean waiting time in the system, for various values of $\rho$, to
the expected waiting time in the system with $\rho = 0$. We see that the effect of nonzero
correlation is greatest for large utilizations. The solid lines depict a smooth fit to the actual
data. Sampling variation, of course, precludes the possibility of obtaining such a smooth fit
without extremely long runs or extensive replications.

Figure 3 shows the standard deviation of the simulated waiting-time process for several
values of the correlation coefficient $\rho$ and for utilization equal to 0.7. For zero correlation
the expected standard deviation in steady state is 2.333; the simulated value is 2.023. Conolly
shows for $\rho = 1$ that the expected standard deviation is 0.783; the simulated value is 0.779.


FIGURE 1.  Mean waiting times, $\nu = 0.7$


The mean and standard deviation of waiting time in the system can be computed from
$\alpha(z)/\alpha(0)$ as given by (14). The agreement between simulated values and exact numerical
values is reasonably good in all cases. The greatest variability occurs for low and negative
values of correlation. The least variability is evident for high values of correlation. There is a
spectacular reduction in mean waiting time for high utilization and high correlation, but the
reduction in the standard deviation is even greater. In any further simulations of this type,
special care must be taken with low values of correlation, as is indicated by comparison of our
simulated results with the exact results.

When  we  initiated  this  research  the  question  of  whether  correlation  between  interarrival 
and  service  times  had  any  effect  on  system  behavior  was  moot.  Exact  expressions  would  have 
been  ideal,  but  they  were,  unfortunately,  lacking.  The  resolution  of  the  question  was  effected 
through  the  use  of  spectral  analytic  techniques.  Conolly  and  Choo  [4,5]  have  since  succeeded 
in  providing  an  elegant  and  broader  analysis  of  the  partially  correlated  queueing  system  we  have 
considered  here. 

4.  SPECTRAL ANALYSIS OF $\{W_n\}$

In this section we review briefly the theory of spectral analysis, show sample power
spectra of $\{W_n\}$ for different values of $\rho$, and, finally, apply a nonparametric test to the
ratio of certain estimated power spectra.


Several authors give complete accounts of the theory and applications of spectral analysis;
e.g., see Anderson [1] and Jenkins [10]. Fishman and Kiviat's [9] paper on the analysis of
simulation generated time series is also of direct interest. The application of spectral analytic
techniques to the waiting-time process for a tandem queueing system is discussed by Mitchell,
et al. [12] and the application here is identical.

FIGURE 2.  Ratio of mean waiting time at $\rho \ne 0$ to expected waiting time at $\rho = 0$ (horizontal axis: correlation $\rho$)

The $\{W_n,\ n = 1, 2, \ldots, N\}$ will be a realization of a stochastic process with mean $\mu$
and autocovariances $\gamma_k$, $k = 0, 1, \ldots$. A study of a time series in terms of its
autocovariances is referred to as a time-domain analysis. Another type of analysis, spectral
analysis, is concerned with the frequency content of the time series. The Fourier cosine transform
of the autocovariances $\gamma_0, \gamma_1, \gamma_2, \ldots$ is called the power spectrum.

Denoting the power spectrum by $f(\omega)$, we can write

(15)  $f(\omega) = 2\left[\gamma_0 + 2\sum_{k=1}^{\infty} \gamma_k \cos 2\pi\omega k\right], \quad 0 \le \omega \le \tfrac{1}{2},$

and we typically estimate $f(\omega)$ with the truncated estimate

(16)  $\hat{f}(\omega_j) = 2\left[\lambda_0 c_0 + 2\sum_{k=1}^{m} \lambda_k c_k \cos 2\pi\omega_j k\right],$

where $\omega_j = j/(2m)$, $j = 0, 1, 2, \ldots, m$, the weights $\lambda_k$, $k = 0, 1, 2, \ldots, m$,
form a so-called lag window, and the $c_k$, $k = 0, 1, 2, \ldots, m$, are the sample estimates of the
autocovariances. We choose the Blackman-Tukey "hamming" window given by

(17)  $\lambda_k = 0.54 + 0.46 \cos(\pi k/m), \quad k = 0, 1, 2, \ldots, m,$

and for the estimates of the autocovariances we use

(18)  $c_k = \dfrac{1}{N} \sum_{n=1}^{N-k} (W_n - \bar{W})(W_{n+k} - \bar{W}),$

where

(19)  $\bar{W} = \dfrac{1}{N} \sum_{n=1}^{N} W_n$

and N is the number of observations in the time series $\{W_n\}$. In (16), the sample
autocovariances $c_{m+1}, c_{m+2}, \ldots$ are omitted since, for m sufficiently large, they should
contribute little information. As a result, only m autocovariances need be calculated, and savings
in computation may be significant. Considerable care must be used when selecting m, however,
because too large a value will increase the variance of the estimates and too small a value will
not give enough resolution.

FIGURE 3.  Standard deviation of waiting times, $\nu = 0.7$ (horizontal axis: customers, in thousands)
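The truncated estimate (16), with the hamming weights (17) and the sample autocovariances (18) and (19), can be sketched as follows. This is a plain-Python illustration under the assumption that the sum in (16) runs over k = 1, ..., m; the function names are hypothetical and no attempt is made at efficiency. The truncation point m must be smaller than the series length.

```python
import math

def sample_autocovariances(w, m):
    """c_k of (18) for k = 0..m, using the overall sample mean (19)."""
    n = len(w)
    mean = sum(w) / n
    return [
        sum((w[i] - mean) * (w[i + k] - mean) for i in range(n - k)) / n
        for k in range(m + 1)
    ]

def hamming_weights(m):
    """Lag-window weights of (17): 0.54 + 0.46 cos(pi k / m), k = 0..m."""
    return [0.54 + 0.46 * math.cos(math.pi * k / m) for k in range(m + 1)]

def spectrum_estimate(w, m):
    """Truncated, windowed spectrum estimate (16) at omega_j = j/(2m), j = 0..m."""
    c = sample_autocovariances(w, m)
    lam = hamming_weights(m)
    estimates = []
    for j in range(m + 1):
        omega = j / (2 * m)
        tail = sum(lam[k] * c[k] * math.cos(2 * math.pi * omega * k)
                   for k in range(1, m + 1))
        estimates.append(2 * (lam[0] * c[0] + 2 * tail))
    return estimates
```

A constant series has zero autocovariances at every lag, so its estimated spectrum is identically zero, which gives a simple sanity check.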

Next we examine sample power spectra associated with the different simulated waiting
time processes and test the hypothesis $f_0(\omega) = f_\rho(\omega)$, $0 \le \omega \le \tfrac{1}{2}$,
where $f_0(\omega)$ is the power spectrum associated with the waiting time process at $\rho = 0$
and, likewise, $f_\rho(\omega)$ corresponds to $\rho \ne 0$.


Figure 4 shows a portion of the sample power spectrum for $\rho = 0$, 0.50, and 1.0;
utilization is 0.70. The sample spectra were formed by 2000 waiting times from the end of a
simulation run of length 25,000. After several pilot runs were made, m in equation (16) was set
equal to 400. In Figure 4 it is obvious that the waiting times for positive correlations give rise
to different spectra than the waiting times for $\rho = 0$.


FIGURE 4.  A portion of waiting time spectra, $\nu = 0.7$ (horizontal axis: frequency $\omega$)


Since the integral of the power spectrum measures the variance of the process [1,10] and
the area under the sample spectrum should be indicative of the sample variance, the illustrated
graphs in Figure 4 show that the variance of the waiting times decreases with positive
correlation. In fact, the corresponding simulated waiting-time series have variances $\sigma^2$ as
follows: $\rho = 0$, $\sigma^2 = 4.763$; $\rho = 0.50$, $\sigma^2 = 2.141$; and $\rho = 1.0$,
$\sigma^2 = 0.514$. (The expected variance for $\rho = 0$ is 5.444 [13] and Conolly's model for
$\rho = 1.0$ gives rise to a variance of 0.613.) Coupled with the simulated result that
$\rho = -0.25$ leads to a variance of 10.451, we see that the effect of positive correlation is to
reduce the variance of the waiting-time process and that negative correlation causes an increase.

The 401 spectral estimates of the power spectrum in (16) are not independent, and so we
employ the notion of equivalent independent estimates. As developed in [10,12], we take
spectral estimates at the frequencies $j/(2m)$, $j = 1, 4, 7, \ldots$, to be approximately
independent and regard the ratio $f_0(\omega)/f_\rho(\omega)$ to be a Bernoulli trial (greater than
unity or less than unity). Under a null hypothesis of homogeneity of the two spectra
$f_0(\omega)$ and $f_\rho(\omega)$, we can take as a test statistic the number of ratios less than
unity.

Figure 5 shows the ratio for $\rho = 0.5$ and $\nu = 0.7$. In the figure, 22 of the 134
approximately independent ratios are less than unity, which is very strong evidence that the null
hypothesis is false; for other values of $\rho$ the number of ratios less than unity are: 87 for
$\rho = -0.25$, 43 for $\rho = 0.25$, and none for $\rho = 1.0$. Additionally, we applied the test
to $f_{0.25}(\omega)$ and $f_{0.50}(\omega)$ and found 38 of the 134 approximately independent
ratios to be less than unity. We reject the implied null hypothesis in all cases and conclude that
the waiting-time process as a function of $\rho$ leads to different power spectra. Similar results
for other values of utilization and correlation obtain.
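To see how strong the evidence from such counts is, one can treat the number of ratios below unity as a binomial count. The sketch below assumes that, under the null hypothesis, each of the n approximately independent ratios falls below unity with probability 1/2; that probability, the two-sided form of the test, and the function name `binomial_two_sided_p` are assumptions made here for illustration and are not spelled out in the text.

```python
from math import comb

def binomial_two_sided_p(k, n):
    """Two-sided p-value for k 'successes' in n fair Bernoulli trials."""
    lower = sum(comb(n, i) for i in range(0, k + 1)) / 2 ** n   # P{X <= k}
    upper = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n   # P{X >= k}
    # Double the smaller tail, capped at 1.
    return min(1.0, 2 * min(lower, upper))
```

Calling `binomial_two_sided_p(22, 134)` for the $\rho = 0.5$ case gives a value many orders of magnitude below any conventional significance level, in line with the conclusion drawn in the text.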

FIGURE 5.  [caption not legible in the source scan; surviving axis label: $f(\omega)$ at $\rho = 0$]


The  investigation  into  the  behavior  of  correlated  queues  is  in  its  infancy.  Even  though  an 
analytical  analysis  of  this  particular  partially  correlated  queue  has  been  obtained,  it  is  highly 
probable  that  behavior  of  different  queueing  systems  will  be  discovered,  as  here,  by  simulation. 

ACKNOWLEDGMENT 

We are grateful to Professor B. W. Conolly for allowing us to make use of as-yet
unpublished results on exact system behavior and for his helpful comments and suggestions.

REFERENCES 

[1] Anderson, T. W., The Statistical Analysis of Time Series (Wiley, New York, 1971).

[2] Bhat, U. N., "Queueing Systems with First-Order Dependence," Opsearch, 6, 1-24 (1969).

[3] Conolly, B. W., "The Waiting Time Process for a Certain Correlated Queue," Operations
Research, 16, 1006-1015 (1968).

[4] Conolly, B. W., and Q. H. Choo, "The Waiting Time Process for a Generalized Correlated
Queue with Exponential Demand and Service," submitted for publication (1977).

[5] Conolly, B. W., and Q. H. Choo, "A Partially Correlated Queue," Research Report 77/1,
Mathematics Department, Chelsea College, University of London, London, England.

[6] Conolly, B. W., and N. Hadidi, "A Correlated Queue," Journal of Applied Probability, 6,
122-136 (1969).

[7] Conolly, B. W., and N. Hadidi, "A Comparison of the Operational Features of Conventional
Queues with a Self-Regulating System," Applied Statistics, 18, 41-53 (1974).

[8] Downton, F., "Bivariate Exponential Distributions in Reliability Theory," Journal of the
Royal Statistical Society, Series B, 32, 408-417 (1970).

[9] Fishman, G. S., and P. J. Kiviat, "The Analysis of Simulation-Generated Time Series,"
Management Science, 13, 525-557 (1967).

[10] Jenkins, G. M., "General Considerations in the Analysis of Spectra," Technometrics, 3,
133-136 (1961).

[11] Kibble, W. F., "A Two-Variate Gamma Type Distribution," Sankhya, 5, 137-150 (1941).

[12] Mitchell, C. R., A. S. Paulson, and C. A. Beswick, "The Effect of Correlated Exponential
Service Times on Single Server Tandem Queues," Naval Research Logistics Quarterly,
24, 95-112 (1977).

[13] Morse, P. M., Queues, Inventories and Maintenance (Wiley, New York, 1958).

[14] Paulson, A. S., "A Characterization of the Exponential Distribution and a Bivariate
Exponential Distribution," Sankhya, 35, 69-78 (1973).

[15] Wicksell, S. D., "On Correlation Functions of Type III," Biometrika, 25, 121-133 (1933).




CONFIDENCE  INTERVALS  RELATED  TO  SEQUENTIAL 
TEST  FOR  THE  EXPONENTIAL  DISTRIBUTION* 

D.  Siegmund 
ABSTRACT 

One-sided sequential tests for the mean of an exponential distribution are
proposed, and the related confidence intervals are computed. The tests behave
like the classical sequential probability-ratio test when the mean is small and
like a fixed-time test when the mean is large and accurate estimation is
important.


1.  INTRODUCTION 

An important example of hypothesis testing for the purpose of decision making is the
quality control of industrial production, for which the required decision frequently is
acceptance or rejection of a production lot. In such cases sequential testing offers certain
advantages over fixed-sample-size tests. Generally speaking, sequential tests of the same Type I
and Type II error probabilities (producer's risk and consumer's risk) require on the average fewer
data to reach a decision than do their fixed-sample counterparts, and this shorter (average)
testing time is usually economically desirable.

There may be cases in which one would like to supplement the decision provided by a test
of a hypothesis with a point or interval estimate for some parameter. An example is the testing
of a prototype with the goal of deciding whether to start production or to continue
development. In the case in which a decision to start production is made, it may be advisable to
have reliability estimates which are accurate enough to permit subsequent comparisons of the field
reliability of production items with the laboratory reliability of the prototypes. In such cases
the advantages of sequential tests are not so apparent, for the early termination of a sequential
test, which seems advantageous from a decision-making point of view, may result in insufficient
data for accurate estimation.

The  goal  of  this  note  is  to  show  by  a concrete  example  that  sequential  tests  can  be  adapted 
so  that  some  of  their  advantages  are  maintained  while  their  concomitant  disadvantages  are 
minimized  from  the  viewpoint  of  estimation. 

2.  A SEQUENTIAL  TEST  AND  THE  RELATED  CONFIDENCE  INTERVALS 

Assume that the times to failure of nominally identical items form a sequence $x_1, x_2, \ldots$
of independent exponentially distributed random variables with mean time between failures
(mean lifetime) equal to $\theta$. Suppose that to determine the acceptability of these items one
desires to test the hypothesis $H_0\colon \theta \ge \theta_0$ vs $H_1\colon \theta \le \theta_1$,
with some prespecified error probabilities $\alpha = P_{\theta = \theta_0}$(reject $H_0$) and
$\beta = P_{\theta = \theta_1}$(accept $H_0$). If $H_0$ is rejected, the items are deemed
unacceptable, and thus $\alpha$ denotes the producer's risk. Similarly, $\beta$ is the consumer's
risk. By a change of scale it may be assumed that $\theta_1 = 1$, and indeed it will be in the rest
of this note. To simplify the discussion it will be convenient to assume that one item is put on
test, replaced, when it fails, by a second item, etc. until the test is terminated.

*Prepared under Contract N00014-77-C-0306 (NR-042-373) for the Office of Naval Research.

Let $s_n = x_1 + \cdots + x_n$ denote the time of the nth failure. The customary fixed-time test
of $H_0$ against $H_1$ rejects $H_0$ if and only if, for appropriately chosen $m_0$ and $t_0$,
$s_{m_0} \le t_0$. The test parameters $m_0$ and $t_0$ are chosen to give the desired values of
$\alpha$ and $\beta$. It is sometimes more convenient to describe the test in terms of X(t), the
number of failures prior to time t, which is a Poisson process of intensity
$\lambda = \theta^{-1}$. The rejection region is given in terms of X(t) by $X(t_0) \ge m_0$.
Operationally, the test is usually censored, i.e., one observes the process of failures until the
$m_0$th failure or for $t_0$ units of time, whichever occurs first. Censoring has no effect on the
accept-reject decision and hence no effect on the error probabilities. It does, however, mean that
the actual time on test is a random variable $\le t_0$. Nevertheless, the term "fixed-time test"
will be used to describe the censored version in order to distinguish it easily from the truly
sequential tests described below. It is important to note that the confidence intervals obtained
from a (censored) fixed-time test are not in general the same as from a true fixed-time test
unless an accept decision is reached.
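Because X(t_0) is Poisson with mean $\lambda t_0$, the error probabilities of the fixed-time test are Poisson tail sums: the Type I error is $P_{\lambda_0}\{X(t_0) \ge m_0\}$ and the Type II error is $P_{\lambda_1}\{X(t_0) < m_0\}$. The sketch below shows the computation. The function names are hypothetical, and the value $m_0 = 14$ used in the usage note is an assumption chosen here for illustration (the table of test parameters did not survive extraction); $\lambda_0 = 1/2$ and $t_0 = 18.9$ are the values quoted later in the note.

```python
import math

def poisson_cdf(k, mean):
    """P{X <= k} for X Poisson with the given mean (returns 0 for k < 0)."""
    term = math.exp(-mean)  # P{X = 0}
    total = 0.0
    for i in range(k + 1):
        total += term
        term *= mean / (i + 1)  # P{X = i + 1} from P{X = i}
    return total

def fixed_time_errors(lam0, lam1, t0, m0):
    """Type I and Type II error probabilities of the test 'reject iff X(t0) >= m0'."""
    alpha = 1.0 - poisson_cdf(m0 - 1, lam0 * t0)  # P_{lam0}{X(t0) >= m0}
    beta = poisson_cdf(m0 - 1, lam1 * t0)         # P_{lam1}{X(t0) < m0}
    return alpha, beta
```

With `fixed_time_errors(0.5, 1.0, 18.9, 14)` both error probabilities come out close to 0.10, so the assumed $m_0$ is at least consistent with the nominal $\alpha = \beta = 0.10$ quoted for the examples.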

Suppose now that in addition to testing $H_0$ one desires to give a confidence interval for
the mean time between failures $\theta$ (or equivalently for the Poisson intensity $\lambda$).
Suppose also that this confidence interval is of primary interest when $H_0$ is accepted. When
$H_0$ is rejected, further development and testing will be necessary; so it is more important to
reach a reject decision as soon as possible than to provide an extremely accurate estimate.

The following test of $H_0$ against $H_1$ is designed to perform like a sequential
probability-ratio test when $H_1$ is true and early termination is desired and like a fixed-length
test when $H_0$ is true and accurate estimation is of primary concern. Let
$l(t) = X(t)\log(\lambda_1/\lambda_0) - (\lambda_1 - \lambda_0)t$ be the log likelihood ratio for
testing the simple hypothesis $\lambda = \lambda_0\ (= \theta_0^{-1})$ against
$\lambda = \lambda_1\ (= \theta_1^{-1} = 1)$. Define the stopping rule T = smallest value of t such
that $l(t) \ge c$. Given $t_1$, stop testing at $\min(T, t_1)$ and reject $H_0$ if and only if
$T \le t_1$. It is convenient to put $a = c/\log(\lambda_1/\lambda_0)$ and
$b = (\lambda_1 - \lambda_0)/\log(\lambda_1/\lambda_0)$, so that

(1)  T = smallest value of t such that $X(t) \ge a + bt$

(see Figure 1). The Type I and Type II error probabilities are $P_{\lambda_0}\{T \le t_1\}$ and
$P_{\lambda = 1}\{T > t_1\}$. The parameters a and $t_1$ should be chosen to make these equal to the
desired $\alpha$ and $\beta$.
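Since X(t) increases only at failure times while the boundary $a + bt$ rises continuously (b > 0 when $\lambda_1 > \lambda_0$), the first crossing in (1) can only occur at a failure time $s_n$, where the condition reads $n \ge a + b s_n$. The sketch below applies the truncated rule to a given list of failure times. The function names are hypothetical, and the code assumes the supplied failure times are increasing and run past $t_1$ whenever no crossing occurs earlier.

```python
import math

def boundary(c, lam0, lam1):
    """Intercept a and slope b of the stopping line X(t) >= a + b t."""
    log_ratio = math.log(lam1 / lam0)
    return c / log_ratio, (lam1 - lam0) / log_ratio

def sequential_test(failure_times, a, b, t1):
    """Return (reject_H0, time_on_test) for stopping rule (1) truncated at t1."""
    for n, s in enumerate(failure_times, start=1):
        if s > t1:
            break  # no crossing by time t1: accept H0 at t1
        # After the n-th failure X(s) = n; check the boundary at that instant.
        if n >= a + b * s:
            return True, s
    return False, t1
```

For instance, with a = 3, b = 0.5 and $t_1 = 10$, failures at times 1, 2, 3, ... cross the boundary at the sixth failure, whereas failures at times 3, 6, 9, 12 never do and the test accepts at time 10.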

Computation of the error probabilities for this test is more complicated than for the
customary fixed-length test. A method which gives adequate approximations for practical purposes
is described in the appendix. The resulting approximations to the Type II and Type I error
probabilities are respectively

(2)  /*, ( 7*> ) = P\  {*„>/,}  - (X’Vxr^1  explr,  (X7x"-l)lFxA.  {sM>/1) 
and 

(3)  = X£((h-Xo )/(l-b)-k0b~i  Pi[T>tl)). 

In equation (2), $\lambda'$ and $\lambda''$ are defined by

(4)  $\lambda' t_1 = a + bt_1$ and $b = (\lambda' - \lambda'')/\log(\lambda'/\lambda'')$,

and m is the smallest integer $\ge a + bt_1$. Also, $P_\lambda$ denotes probability when the true
intensity of X(t) is $\lambda$, so $P_\lambda\{s_m > t_1\}$ is the probability that a $\chi^2$
random variable with 2m degrees of freedom exceeds $2\lambda t_1$. To attain given values
$\alpha = P_{\lambda_0}\{T \le t_1\}$ and $\beta = P_1\{T > t_1\}$ using (2) and (3), first set
$P_1\{T > t_1\}$ equal to the desired value $\beta$ on the right-hand side of (3) and then solve for
a in terms of $\alpha$ and $\beta$. The value of $t_1$ may easily be determined from (2) by trial
and error. Table 1 compares two (censored) fixed-length tests with the corresponding sequential
tests. For both examples $\alpha = \beta = 0.10$. The sequential tests require on the average about
30-percent less testing time than do the fixed-length tests when $\lambda = 1$, and it is desirable
to minimize testing time. They pay for this by requiring about 30-percent more maximum testing
time. The maximum testing time is attained whenever $H_0$ is accepted, and in this case one
desires to have an accurate estimate of $\lambda$. The fact that more time on test generally leads
to more accurate estimation of $\lambda$ tends to minimize the disadvantage of a larger maximum
test duration. Nevertheless, a modified sequential test in which the maximum time on test is the
same as in the fixed-length test will be described in Section 3.

FIGURE 1.  Stopping rule T and sequential test (regions: reject, accept, continuation region).


TABLE 1.  Comparison of (censored) fixed-length and sequential tests

[Table entries are not recoverable from the source scan. Surviving labels: tests "Fixed Length" and "Sequential"; columns "Test Parameters" and "Expected Testing Time When $\lambda = 1$".]


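To compute confidence intervals for $\lambda$ based on observation of X(t) for
$0 \le t \le \min(T, t_1)$, it is convenient to define several auxiliary functions. For fixed
$0 < \gamma < 1/2$ let

$\bar{\lambda}_1(t) = \sup[\lambda\colon P_\lambda\{T \ge t\} \ge \gamma],$

$\bar{\lambda}_2(n) = \sup[\lambda\colon P_\lambda\{T > t_1,\ X(t_1) \le n\} \ge \gamma],$

$\underline{\lambda}_1(t) = \inf[\lambda\colon P_\lambda\{T \le t\} \ge \gamma],$

and

$\underline{\lambda}_2(n) = \inf[\lambda\colon P_\lambda\{T \le t_1\} + P_\lambda\{T > t_1,\ X(t_1) \ge n\} \ge \gamma].$

It is easy to see that $\bar{\lambda}_i \ge \underline{\lambda}_i$ (i = 1, 2). Moreover, by an
argument similar to that given by Siegmund [5] for a similar problem, it may be shown that

(5)  $\bar{\lambda} = \begin{cases} \bar{\lambda}_1(T) & \text{if } T \le t_1, \\ \bar{\lambda}_2[X(t_1)] & \text{if } T > t_1 \end{cases}$

is a $(1 - \gamma)100\%$ upper confidence bound for $\lambda$ and

(6)  $\underline{\lambda} = \begin{cases} \underline{\lambda}_1(T) & \text{if } T \le t_1, \\ \underline{\lambda}_2[X(t_1)] & \text{if } T > t_1 \end{cases}$

is a $(1 - \gamma)100\%$ lower confidence bound for $\lambda$. Also
$[\underline{\lambda}, \bar{\lambda}]$ is a $(1 - 2\gamma)100\%$ confidence interval.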


Computation of the probabilities entering into the definitions of $\underline{\lambda}$ and
$\bar{\lambda}$ involves the same problems as computation of the error probabilities, and it is
discussed in detail in the appendix. For many of the cases of greatest interest, i.e., when $H_0$
is accepted, $\underline{\lambda}$ and $\bar{\lambda}$ are almost the same as the customary
confidence limits based on the same data. For example, suppose that the sequential test defined
above terminated at time $t_1$ with $X(t_1) = n$. Then finding an upper confidence bound for
$\lambda$ involves computing

(7)  $P_\lambda\{T > t_1,\ X(t_1) \le n\} = P_\lambda\{X(t_1) \le n\} - P_\lambda\{T \le t_1,\ X(t_1) \le n\}.$

If a fixed-time test were to lead to this same data, the appropriate $(1 - \gamma)100\%$ upper
confidence bound for $\lambda$ would be $\lambda^*[X(t_1)]$, where
$\lambda^*(n) = \sup[\lambda\colon P_\lambda\{X(t_1) \le n\} \ge \gamma]$. Whenever the second
probability on the right-hand side of (7) is small compared to the first probability,
$\bar{\lambda}_2(n)$ will be about equal to $\lambda^*(n)$, and an appropriate confidence bound
based on the sequential test will be about the same as the usual confidence bound. This is
presumably the case unless n is quite close to $a + bt_1$. As an example, consider the first
sequential test described in Table 1, for which $a + bt_1 = 20$, and suppose that $X(t_1) = 17$.
The usual 80-percent confidence interval for $\lambda$ would be (0.50, 0.945), while the
sequential-test-based confidence interval is $[\underline{\lambda}, \bar{\lambda}] = (0.47, 0.92)$.
Although the usual interval is based on the hypothesis that a fixed-length test was being used and
is not, strictly speaking, correct under the present conditions, it is probably accurate enough for
practical purposes. For $X(t_1) < 17$ it will be even more accurate. If $T \le t_1$, there is no
reason to believe that the usual confidence interval based on the same data will approximate the
interval $[\underline{\lambda}, \bar{\lambda}]$ defined in (5) and (6).
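The customary fixed-time bound $\lambda^*(n) = \sup[\lambda\colon P_\lambda\{X(t_1) \le n\} \ge \gamma]$ is straightforward to compute, because the Poisson probability $P_\lambda\{X(t_1) \le n\}$ decreases in $\lambda$. The sketch below finds it by bisection. The function names, the search ceiling `hi` and the tolerance are assumptions made here; the value of $t_1$ for the tests in Table 1 is not recoverable from the source, so it has to be supplied by the reader.

```python
import math

def poisson_cdf(n, mean):
    """P{X <= n} for X Poisson with the given mean."""
    term = math.exp(-mean)
    total = 0.0
    for i in range(n + 1):
        total += term
        term *= mean / (i + 1)
    return total

def usual_upper_bound(n, t1, gamma, hi=1e3, tol=1e-10):
    """lambda*(n) = sup{lambda : P_lambda{X(t1) <= n} >= gamma}, by bisection."""
    lo = 0.0  # at lambda = 0 the probability is 1 >= gamma
    # The probability is decreasing in lambda, so the feasible set is [0, lambda*].
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n, mid * t1) >= gamma:
            lo = mid
        else:
            hi = mid
    return lo
```

For n = 0 the bound has the closed form $-\log(\gamma)/t_1$, which provides a direct check of the bisection.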

3.  OTHER  SEQUENTIAL  TESTS 

The sequential test suggested above has one important disadvantage vis-a-vis the
customary fixed-length test, to wit, its maximum test duration is greater. One can argue that this
is actually an advantage, because the maximum test time is attained when $H_0$ is accepted and
accurate estimation of $\lambda$ is of primary importance. Nevertheless, it is interesting to note
that one can find tests similar in spirit to the sequential test of Section 2 with no increase in
the maximum testing time over the usual fixed-length tests.

Suppose that $t_0$ and $m_0$ define a fixed-length test as in Section 2 and let the stopping rule
T be defined by (1) with $b = -(1 - \lambda_0)/\log \lambda_0$ as before, but with a to be
determined. Consider the class of sequential tests which stop testing at $\min(T, t_0)$, and for
some $m_0^* < a + bt_0$ reject $H_0$ if and only if $T \le t_0$ or $T > t_0$ and
$X(t_0) \ge m_0^*$. The Type I and Type II error probabilities are, respectively,

$P_{\lambda_0}\{T \le t_0\} + P_{\lambda_0}\{T > t_0,\ X(t_0) \ge m_0^*\}$

and

$P_1\{T > t_0,\ X(t_0) < m_0^*\},$

and the two free test parameters a and $m_0^*$ are to be chosen to make these error probabilities
equal to the desired $\alpha$ and $\beta$. Approximate computation of these probabilities presents
essentially the same problems as the determination of the confidence intervals of the preceding
section and may be handled by the methods of the appendix. Unfortunately, a trial and error
determination of a and $m_0^*$ is a little more complicated than determination of a and $t_1$ in
Section 2. Nevertheless, it can be accomplished fairly quickly by starting from the values of a for
the sequential test and $m_0$ for the fixed length test in Section 2 and increasing both slightly to
obtain trial values of a and $m_0^*$.

Corresponding to the first test in Table 1 ($\lambda_0 = 1/2$, $\alpha = \beta = 0.1$, and $t_0 = 18.9$), for
$a = 3$ and $m_0^* = 15$ the test defined above has approximate error probabilities $\alpha = 0.099$ and
$\beta = 0.109$, which are probably close enough to their nominal values for practical purposes. The
expected testing time when $\lambda = 1$ is about 9.5, which is only slightly larger than that for the
sequential test of Section 2 and still considerably smaller than that for the fixed-time test.
Computation of confidence intervals for this test is exactly the same as for the sequential test of
Section 2. It is interesting to note that whenever both this test and the fixed-time test accept
$H_0$ (i.e., $T > t_0$ and $X(t_0) < m_0$) the confidence limits for the sequential test are shifted slightly
towards smaller values of $\lambda$ (larger values of $\theta$) than the customary intervals, but the
difference is probably not important for practical purposes.
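The error probabilities quoted above can be checked roughly by simulation. The following is a minimal Monte Carlo sketch, not the appendix method: it assumes that the stopping rule (1), which is not restated in this section, is $T = \inf\{t : X(t) \ge a + bt\}$ for a Poisson process $X(t)$ of rate $\lambda$, and that $\lambda_1 = 1$. The function names, the number of runs, and the seed are illustrative choices.

```python
import math
import random

def simulate_test(lam, lam0=0.5, a=3.0, t0=18.9, m0_star=15, rng=random):
    """Simulate one test run; return (rejected_H0, testing_time)."""
    b = -(1.0 - lam0) / math.log(lam0)   # boundary slope, assuming lambda_1 = 1
    t, n = 0.0, 0                        # current time and event count X(t)
    while True:
        t += rng.expovariate(lam)        # time of the next event
        if t > t0:
            # Boundary not crossed by t0: decide from the count X(t0) = n.
            return n >= m0_star, t0
        n += 1
        # X(t) only increases at event times, so the rising line a + b*t
        # can only be reached at an event.
        if n >= a + b * t:
            return True, t               # T = t <= t0, reject H0

def estimate(lam, runs=20000, seed=1):
    """Return (rejection frequency, mean testing time) at rate lam."""
    rng = random.Random(seed)
    rejects, total_time = 0, 0.0
    for _ in range(runs):
        rejected, duration = simulate_test(lam, rng=rng)
        rejects += rejected
        total_time += duration
    return rejects / runs, total_time / runs

alpha_hat, _ = estimate(0.5)                   # Type I: reject when lambda = lambda_0
reject_rate_h1, mean_time_h1 = estimate(1.0)   # behavior when lambda = 1
beta_hat = 1.0 - reject_rate_h1                # Type II: accept when lambda = 1
print(alpha_hat, beta_hat, mean_time_h1)
```

The simulated frequencies carry sampling error of a few thousandths at this number of runs, so they can only indicate whether the approximate values in the text are of the right size.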

Until now it has been assumed that the desire to estimate $\lambda$ accurately when $H_0$ is true is
of primary consideration and that early termination of the test under $H_0$ is relatively
unimportant. Hence, the proposed sequential tests have been designed to behave like a sequential
probability-ratio test under $H_1$ and like a fixed-time test under $H_0$. It is undoubtedly possible
to obtain somewhat earlier termination for small values of $\lambda$ without an appreciable loss of
estimation accuracy, but whether the benefits are worth the additional complication remains for a
future study to decide.

One possibility is to introduce a lower stopping boundary $-c + bt$, so that testing
terminates if $X(t) > a + bt$, $X(t) < -c + bt$, or $t = t_2$ for some suitably chosen $a$, $c$, and $t_2$.
This stopping rule is similar to the usual (truncated) sequential probability-ratio test. However,
by choosing $c$ fairly large, one does not reduce the sampling time under $H_0$ too much and
accurate estimation is still possible. Confidence intervals may be defined similarly to those above,
and it is hoped that the approximate computational methods of the appendix may be adapted to
this case as well.


D. SIEGMUND


APPENDIX 

The following theorem is the theoretical basis for (2) and the calculations required to
determine $\lambda_i$ ($i = 1, 2$). It is related to Proposition 1 of Siegmund [5] (see also Woodroofe [6]).
The approximation (3) is based on a heuristic argument which is given following the proof of
Theorem 1. An approximation to the expected sample size for $\lambda > \lambda'$ is suggested at the end
of this appendix.

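THEOREM 1: Let $T$ be defined by (1). Let $m$ denote the least integer $\ge a + bt_1$.
Assume that $a \to \infty$ and $t_1 \to \infty$ in such a way that $\Delta = m - (a + bt_1)$ remains constant and for
some $\lambda' > b$

(8)  $\lambda' t_1 = a + bt_1$.

Define $\lambda'' < b$ by

(9)  $b = (\lambda' - \lambda'')/\log(\lambda'/\lambda'')$.

Then for each $\lambda > 0$, $k = 1, 2, \ldots$, as $a \to \infty$,

$P_\lambda\{T < t_1,\ X(t_1) = m - k\} \sim (\lambda''/\lambda')^{a + bt_1} \exp[\lambda t_1(\lambda'/\lambda'' - 1)]\, P_{\lambda\lambda'/\lambda''}\{X(t_1) = m - k\}$.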

PROOF: By the definition of conditional probability,

(10)  $P_\lambda\{T < t_1,\ X(t_1) = m - k\} = P_\lambda\{T < t_1 \mid X(t_1) = m - k\}\, P_\lambda\{X(t_1) = m - k\}$.

It follows from Lemmas 1 and 3 below that the conditional probability in (10) converges to
$(\lambda''/\lambda')^{k - \Delta}$ as $a \to \infty$. The theorem follows by substitution of the values of
$P_\lambda\{X(t_1) = m - k\}$.

LEMMA 1: $P_\lambda\{T < t_1 \mid X(t_1) = m - k\}$ does not depend on $\lambda$, and under the conditions of
Theorem 1 converges to $P_{\lambda'}\{X(t) < -k + \Delta + bt \text{ for some } t > 0\}$ as $a \to \infty$.

PROOF: That $P_\lambda\{T < t_1 \mid X(t_1) = m - k\}$ does not depend on $\lambda$ is an immediate
consequence of the sufficiency of $X(t_1)$. Hence, it suffices to give the proof for some particular
value of $\lambda$. Also, for $0 < r < t_1$

(11)  $P_\lambda\{X(t) > a + bt \text{ for some } t_1 - r < t < t_1 \mid X(t_1) = m - k\} \le P_\lambda\{T < t_1 \mid X(t_1) = m - k\}$

$\le P_\lambda\{T < t_1 - r \mid X(t_1) = m - k\} + P_\lambda\{X(t) > a + bt \text{ for some } t_1 - r < t < t_1 \mid X(t_1) = m - k\}$.

By symmetry the first probability in (11) equals

$P_\lambda\{X(t_1) - X(t) < -k + \Delta + b(t_1 - t) \text{ for some } t,\ t_1 - r < t < t_1 \mid X(t_1) = m - k\}$

$= P_\lambda\{X(t) < -k + \Delta + bt \text{ for some } 0 < t < r \mid X(t_1) = m - k\}$.

It is easy to see by direct calculation that, conditional on $X(t_1) = m - k \sim \lambda' t_1$, the process
$X(t)$, $0 < t < r$, converges to a Poisson process of intensity $\lambda'$. Hence, the preceding
conditional probability converges to

$P_{\lambda'}\{X(t) < -k + \Delta + bt \text{ for some } 0 < t < r\}$.

Since $r$ is arbitrary, to complete the proof it suffices, by (11), to show that

(12)  $P_\lambda\{T < t_1 - r \mid X(t_1) = m - k\} \le \epsilon(r)$,

where $\epsilon(r)$ does not depend on $a$ (or $t_1$) and converges to 0 as $r \to \infty$. First observe that




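$P_\lambda\{X(t_1) = m - k\} = \exp[(\lambda' - \lambda)t_1]\,(\lambda/\lambda')^{m-k}\, P_{\lambda'}\{X(t_1) = m - k\}$,

and by the local limit theorem and (8)

$P_{\lambda'}\{X(t_1) = m - k\} \sim (2\pi m)^{-1/2}$.

Hence, to prove (12) it suffices to show for some $\lambda > 0$

(13)  $P_\lambda\{T \le t_1 - r\} \le \epsilon(r) \exp[(\lambda' - \lambda)t_1]\,(\lambda/\lambda')^{m-k}\, m^{-1/2}$.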

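The proof of (13) is simplified notationally if we assume that $m = a + bt_1$ and take $r$ to be
such that $br = j$ is an integer. The general case requires only slight changes. Then, with $s_n$
denoting the time of the $n$th event,

$P_\lambda\{T \le t_1 - r\} = P_\lambda\{X(t) \ge a + bt \text{ for some } t \le t_1 - r\}$

$= P_\lambda\{n \ge a + bs_n \text{ for some } n \le m - j\}$

$\le \sum_{a < n \le m-j} P_\lambda\{s_n \le b^{-1}(n - a)\} = \sum_{a < n \le m-j} P_\lambda\{X[b^{-1}(n - a)] \ge n\}$

$= \sum_{a < n \le m-j} \sum_{i \ge n} \exp[(\lambda' - \lambda)b^{-1}(n - a)]\,(\lambda/\lambda')^{i}\, P_{\lambda'}\{X[b^{-1}(n - a)] = i\}$.

It may be shown for all $t \le t_1$ and $i > a$, by standard large deviation theory for $t < \epsilon t_1$
(Bahadur and Rao [1]) and by a local limit theorem for $\epsilon t_1 \le t \le t_1$, that
$P_{\lambda'}\{X(t) = i\} \le Cm^{-1/2}$. Hence, for all $\lambda < \lambda'$

$P_\lambda\{T \le t_1 - r\} \le C(1 - \lambda/\lambda')^{-1} m^{-1/2} \exp[(\lambda - \lambda')b^{-1}a] \sum_{a < n \le m-j} \{\exp[(\lambda' - \lambda)b^{-1}]\,(\lambda/\lambda')\}^{n}$

$= C(1 - \lambda/\lambda')^{-1} m^{-1/2} (\lambda/\lambda')^{m-j} \exp[(\lambda - \lambda')(-t_1 + b^{-1}j)] \sum_{a < n \le m-j} \{\exp[(\lambda' - \lambda)b^{-1}]\,(\lambda/\lambda')\}^{n-m+j}$.

By (9), for $\lambda'' < \lambda < \lambda'$, $\exp[(\lambda' - \lambda)b^{-1}]\,(\lambda/\lambda') > 1$, and hence the indicated sum above can be
extended as a convergent series to $n = -\infty$. Comparing the resulting inequality with (13) and
recalling that $j = br$, we can complete the proof.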

The following lemmas are known, but for lack of a convenient reference and for
completeness their proofs are sketched.

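LEMMA 2: Let $\mu_1, \mu_2$ be positive numbers and $\tau$ a stopping rule for $X(t)$, $0 \le t < \infty$.
Then for arbitrary $t > 0$

(14)  $P_{\mu_1}\{\tau \le t\} = \int_{\{\tau \le t\}} \exp[(\mu_2 - \mu_1)\tau]\,(\mu_1/\mu_2)^{X(\tau)}\, dP_{\mu_2}$.

PROOF: Let $P_\mu^{(t)}$ denote the restriction of $P_\mu$ to the space of $X(s)$, $0 \le s \le t$, and let
$Z(t) = \exp[(\mu_2 - \mu_1)t]\,(\mu_1/\mu_2)^{X(t)}$ be the likelihood ratio of $P_{\mu_1}^{(t)}$ relative to $P_{\mu_2}^{(t)}$. Then

$P_{\mu_1}\{\tau > t\} = \int_{\{\tau > t\}} Z(t)\, dP_{\mu_2}$.

Since $Z(t)$, $0 \le t < \infty$, is a $P_{\mu_2}$-martingale,

$1 = E_{\mu_2}[Z(\min(\tau, t))] = \int_{\{\tau \le t\}} Z(\tau)\, dP_{\mu_2} + \int_{\{\tau > t\}} Z(t)\, dP_{\mu_2}$,

and (14) follows by subtraction.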



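LEMMA 3: For each $x > 0$ and $\lambda' > b$

$P_{\lambda'}\{X(t) < -x + bt \text{ for some } t > 0\} = (\lambda''/\lambda')^{x}$,

where $\lambda'' < b$ is defined by (9).

PROOF: Let $\tau = \inf\{t : X(t) < -x + bt\}$, and observe that with probability 1

(15)  $X(\tau) = -x + b\tau$ on $\{\tau < \infty\}$.

Letting $t \to \infty$ in (14) and appealing to (9) and (15) we obtain

$P_{\lambda'}\{\tau < \infty\} = \int_{\{\tau < \infty\}} \exp\{(\lambda'' - \lambda')\tau\}\,(\lambda'/\lambda'')^{X(\tau)}\, dP_{\lambda''}$

$= (\lambda''/\lambda')^{x}\, P_{\lambda''}\{\tau < \infty\}$.

Since $\lambda'' < b$, $P_{\lambda''}\{\tau < \infty\} = 1$, which completes the proof.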
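The conjugate rate $\lambda''$ in (9) has no closed form in general, but it is easy to compute numerically. The sketch below is a minimal illustration, assuming simple bisection is acceptable: it solves (9) for $\lambda''$ in $(0, b)$ and evaluates the probability in Lemma 3. The function names and the tolerance are hypothetical choices.

```python
import math

def conjugate_rate(lam_prime, b, tol=1e-12):
    """Solve b = (lam' - lam'') / log(lam' / lam'') for lam'' in (0, b)."""
    if not lam_prime > b > 0:
        raise ValueError("requires lam' > b > 0")

    def f(x):
        # Increasing in x; negative near 0 and positive at x = b.
        return (lam_prime - x) / math.log(lam_prime / x) - b

    lo, hi = 1e-300, b
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def lemma3_probability(x, lam_prime, b):
    """(lam''/lam')**x, the probability given in Lemma 3."""
    return (conjugate_rate(lam_prime, b) / lam_prime) ** x

# With lam' = 1 and b = -(1 - 0.5)/log(0.5), the conjugate rate is 0.5.
b = -(1.0 - 0.5) / math.log(0.5)
lam2 = conjugate_rate(1.0, b)
print(lam2, lemma3_probability(3, 1.0, b))
```

The bracket is valid because $(\lambda' - x)/\log(\lambda'/x)$ is the logarithmic mean of $\lambda'$ and $x$, which lies strictly between them, so the function is positive at $x = b$ and tends to $-b$ as $x \to 0$.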

COROLLARY 1: For $\lambda > \lambda'$ and $k = 1, 2, \ldots$, under the conditions of Theorem 1

(16)  $P_\lambda\{T < t_1,\ X(t_1) \le m - k\} \sim (\lambda''/\lambda')^{a + bt_1} \exp[\lambda t_1(\lambda'/\lambda'' - 1)]\, P_{\lambda\lambda'/\lambda''}\{X(t_1) \le m - k\}$.

PROOF: Obviously for all $j > 0$

(17)  $P_\lambda\{T < t_1,\ m - k - j < X(t_1) \le m - k\} \le P_\lambda\{T < t_1,\ X(t_1) \le m - k\}$

$\le P_\lambda\{T < t_1,\ m - k - j < X(t_1) \le m - k\} + P_\lambda\{X(t_1) \le m - k - j\}$.

For fixed $j$ the probability on the left and the first probability on the right of (17) may be
evaluated asymptotically by appealing to Theorem 1. The second term on the right-hand side of
(17) may be shown to be of smaller order of magnitude for large $j$ by standard large-deviation
arguments. See Bahadur and Rao [1] or Siegmund [4] for the details of similar arguments.

REMARKS: (i) Corollary 1 may be used to evaluate approximately the probability in (2)
and the probabilities appearing in the definitions of $\lambda_i$ by virtue of the relation

$P_\lambda\{T > t_1,\ X(t_1) \le n\} = P_\lambda\{X(t_1) \le n\} - P_\lambda\{T < t_1,\ X(t_1) \le n\}$.

The approximation can be expected to yield accurate results only when $n$ is close to $a + bt_1$,
i.e., $n = m - k$ for a fairly small $k$, and $\lambda > \lambda'$. But these are essentially the only values of
interest, as was observed in Section 2.

(ii) A slightly more sophisticated argument similar to that given by Siegmund [5] shows
that (16) holds for all $\lambda > \lambda''$, $\lambda \ne \lambda'$.

(iii) It is tempting to evaluate the probability appearing in the definition of $\lambda_2(n)$ by
rewriting it as

$P_\lambda\{X(t_1) > n\} + P_\lambda\{T < t_1,\ X(t_1) \le n\}$

and then applying Corollary 1. However, this evaluation is of interest for small $\lambda$ when
Corollary 1 is not applicable.

Now suppose that $\lambda_0 < \lambda''$, and let $\lambda_1 > \lambda'$ be defined by

(18)  $(\lambda_1 - \lambda_0) = b \log(\lambda_1/\lambda_0)$.



These values $\lambda_0$ and $\lambda_1$ may be the values specified by the test of the hypothesis of Section 2
but need not be. The approximation (3) and an approximate evaluation of the probabilities
appearing in the definitions of $\lambda_i$ ($i = 1, 2$) may be obtained if we write

(19)  $P_{\lambda_0}\{T \le t_1\} = P_{\lambda_0}\{T < \infty\} - P_{\lambda_0}\{t_1 < T < \infty\}$

and give separate approximations for the two terms on the right-hand side of (19). Letting
$t \to \infty$ in Lemma 2 (note that $P_{\lambda_1}\{T < \infty\} = 1$) and appealing to (18) we obtain the
representations

$P_{\lambda_0}\{T < \infty\} = (\lambda_0/\lambda_1)^{a}\, E_{\lambda_1}[(\lambda_0/\lambda_1)^{X(T) - a - bT}]$

and

(20)  $P_{\lambda_0}\{t_1 < T < \infty\} = (\lambda_0/\lambda_1)^{a} \int_{\{T > t_1\}} E_{\lambda_1}[(\lambda_0/\lambda_1)^{X(T) - a - bT} \mid X(t_1)]\, dP_{\lambda_1}$.

The limiting value of $E_{\lambda_1}[(\lambda_0/\lambda_1)^{X(T) - a - bT}]$ as $a \to \infty$ is given in Lemma 4 below, leading to
the asymptotic relation

(21)  $P_{\lambda_0}\{T < \infty\} \sim [(b - \lambda_0)/(\lambda_1 - b)]\,(\lambda_0/\lambda_1)^{a}$ as $a \to \infty$.

The integral in (20) may be rewritten as

(22)  $\sum_{k} P_{\lambda_1}\{T > t_1,\ X(t_1) = m - k\}\, E_{\lambda_1}[(\lambda_0/\lambda_1)^{X(T_{k-\Delta}) - k + \Delta - bT_{k-\Delta}}]$,

where $T_x = \inf\{t : t > 0,\ X(t) \ge x + bt\}$ and $\Delta = m - (a + bt_1)$ as before. To obtain a simple
approximation to (22), one might replace the expectations there by their limits as $k \to \infty$, or
one might replace all these expectations by that for $T_0$, which is also given in Lemma 4. The
latter alternative seems more attractive for two reasons: (a) the values of
$P_{\lambda_1}\{T > t_1,\ X(t_1) = m - k\}$ are larger for smaller values of $k$, and (b) using $k = 0$ rather than $k$
infinitely large we obtain a smaller approximation to (22) and hence a larger (more
conservative) approximation to $P_{\lambda_0}\{T \le t_1\}$. Replacing the $T_{k-\Delta}$ by $T_0$ in (22) leads, by Lemma 4
below, to the approximation

$P_{\lambda_0}\{t_1 < T < \infty\} \approx \lambda_0 b^{-1}\, P_{\lambda_1}\{T > t_1\}\,(\lambda_0/\lambda_1)^{a}$.

Substituting this result and (21) into (19), we obtain

$P_{\lambda_0}\{T \le t_1\} \approx (\lambda_0/\lambda_1)^{a}\,[(b - \lambda_0)/(\lambda_1 - b) - \lambda_0 b^{-1}\, P_{\lambda_1}\{T > t_1\}]$,

which is (3) in the special case $\lambda_1 = 1$.
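As a numerical illustration of the two approximations just derived, the sketch below evaluates the asymptotic relation (21) and the final approximation to $P_{\lambda_0}\{T \le t_1\}$. It is only a sketch under stated assumptions: the function names are hypothetical, and $P_{\lambda_1}\{T > t_1\}$ is treated as an input supplied by the caller (for example, from Corollary 1) rather than computed here.

```python
import math

def boundary_slope(lam0, lam1):
    """Slope b determined by (18): (lam1 - lam0) = b * log(lam1 / lam0)."""
    return (lam1 - lam0) / math.log(lam1 / lam0)

def prob_ever_crossing(lam0, lam1, a):
    """Right-hand side of (21), approximating P_{lam0}{T < infinity}."""
    b = boundary_slope(lam0, lam1)
    return (b - lam0) / (lam1 - b) * (lam0 / lam1) ** a

def prob_crossing_by_t1(lam0, lam1, a, p1_no_cross):
    """Approximation to P_{lam0}{T <= t1}; p1_no_cross stands for P_{lam1}{T > t1}."""
    b = boundary_slope(lam0, lam1)
    return (lam0 / lam1) ** a * ((b - lam0) / (lam1 - b) - lam0 / b * p1_no_cross)

# Values from the example of Section 3: lam0 = 1/2, lam1 = 1, a = 3.
p_inf = prob_ever_crossing(0.5, 1.0, 3.0)
p_t1 = prob_crossing_by_t1(0.5, 1.0, 3.0, p1_no_cross=0.2)  # 0.2 is an illustrative input
print(round(p_inf, 4), round(p_t1, 4))
```

Setting `p1_no_cross` to zero recovers (21), which corresponds to letting $t_1 \to \infty$.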

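LEMMA 4: For $\lambda_0$ and $\lambda_1$ defined by (18)

$\lim_{a \to \infty} E_{\lambda_1}[(\lambda_0/\lambda_1)^{X(T_a) - a - bT_a}] = (b - \lambda_0)/(\lambda_1 - b)$

and

$E_{\lambda_1}[(\lambda_0/\lambda_1)^{X(T_0) - bT_0}] = \lambda_0/b$.

PROOF: Let $\tau_a = \inf\{n : n - bs_n > a\}$, so $X(T_a) = \tau_a$ and $T_a = s_{\tau_a}$. Also, let
$\tau_- = \inf\{n : n - bs_n \le 0\}$. By Lemma 2 and random walk theory (Feller [2], Chapter XII),

$E_{\lambda_1}[(\lambda_0/\lambda_1)^{X(T_0) - bT_0}] = P_{\lambda_0}\{T_0 < \infty\} = 1 - P_{\lambda_0}\{\tau_0 = \infty\}$

$= 1 - 1/E_{\lambda_0}[\tau_-] = 1 - (1 - b\lambda_0^{-1})/E_{\lambda_0}[\tau_- - bs_{\tau_-}] = 1 + (1 - b\lambda_0^{-1})/(b\lambda_0^{-1}) = \lambda_0/b$.

By considering the renewal process defined by $\tau_0 - bs_{\tau_0}$, the renewal theorem gives the $P_{\lambda_1}$
limiting distribution of $\tau_a - bs_{\tau_a} - a = X(T_a) - a - bT_a$ as $a \to \infty$ (Feller [2], Chapter XI).
Combined with (18) and random-walk theory (Feller [2], Chapter XII), this leads to

$\lim_{a \to \infty} E_{\lambda_1}[(\lambda_0/\lambda_1)^{X(T_a) - a - bT_a}] = \lim_{a \to \infty} E_{\lambda_1} \exp[-b^{-1}(\lambda_1 - \lambda_0)(\tau_a - bs_{\tau_a} - a)]$

$= \{E_{\lambda_1}[\tau_0 - bs_{\tau_0}]\}^{-1} \int_0^\infty \exp[-b^{-1}(\lambda_1 - \lambda_0)x]\, P_{\lambda_1}\{\tau_0 - bs_{\tau_0} > x\}\, dx$

$= [(1 - b/\lambda_1)E_{\lambda_1}\tau_0]^{-1}\, b(\lambda_1 - \lambda_0)^{-1}\, P_{\lambda_0}\{\tau_0 = \infty\}$

$= b(\lambda_1 - \lambda_0)^{-1}(1 - b/\lambda_1)^{-1}(E_{\lambda_0}\tau_-)^{-1}\, P_{\lambda_1}\{\tau_- = \infty\}$

$= b(\lambda_1 - \lambda_0)^{-1}(1 - b/\lambda_1)^{-1}(1 - \lambda_0/b)(1 - \lambda_0/\lambda_1)$

$= (b - \lambda_0)/(\lambda_1 - b)$.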


There remains the task of computing approximations to $E_\lambda[\min(T, t_1)]$, especially for
$\lambda > 1$. With the help of Theorem 1, this becomes an easy matter. It is easy to see that for
$\lambda > b$

(23)  $E_\lambda[\min(T, t_1)] = E_\lambda[T] - \int_{\{T > t_1\}} E_\lambda[T - t_1 \mid X(t_1)]\, dP_\lambda$.

The usual Wald approximation to $E_\lambda[T]$ obtained by using Wald's identity and ignoring the
excess $X(T) - (a + bT)$ is

(24)  $E_\lambda[T] \approx a/(\lambda - b)$.

The same argument gives the approximation

(25)  $E_\lambda[T - t_1 \mid X(t_1) = m - k] \approx (k - \Delta)/(\lambda - b)$.

Hence the second term on the right-hand side of (23) may be rewritten

$\sum_{k \ge 1} E_\lambda[T - t_1 \mid X(t_1) = m - k]\,[P_\lambda\{X(t_1) = m - k\} - P_\lambda\{T < t_1,\ X(t_1) = m - k\}]$

and evaluated approximately with the help of (25) and Theorem 1 to obtain

(26)  $\int_{\{T > t_1\}} E_\lambda[T - t_1 \mid X(t_1)]\, dP_\lambda \approx (\lambda - b)^{-1}[(a + bt_1) P_\lambda\{T > t_1\} - \lambda t_1 P_\lambda\{s_{m-1} > t_1\}$
$+ (\lambda''/\lambda')^{a + bt_1 - 1}\, \lambda t_1 \exp\{\lambda t_1(\lambda'/\lambda'' - 1)\}\, P_{\lambda\lambda'/\lambda''}\{s_{m-1} > t_1\}]$,

where $s_{m-1}$ is the time of the $(m-1)$th event, so that $\{s_{m-1} > t_1\}$ is the event $\{X(t_1) \le m - 2\}$.

Substituting (24) and (26) into (23) gives an approximation to $E_\lambda[\min(T, t_1)]$. It should be
fairly accurate, at least whenever $\lambda > \lambda'$, so $P_\lambda\{T > t_1\}$ is small and the second term on the
right-hand side of (23) is small compared with the first. It is possible to approximate the expected
excess over the boundary $E_\lambda[X(T) - a - bT]$, and hence give a slightly more accurate
approximation. However, the increase in accuracy is rather modest in contrast to the problem of
approximating error probabilities, where the excess over the boundary is more important (see
Siegmund [3]).



REFERENCES 

[1] Bahadur, R.R., and Rao, R. Ranga, "On Deviations of the Sample Mean," Annals of
Mathematical Statistics, 31, 1015-1027 (1960).

[2] Feller, W., An Introduction to Probability Theory and Its Applications, Vol. II (Wiley, New
York, 1966).

[3] Siegmund, D., "Error Probabilities and Average Sample Number of the Sequential Probability
Ratio Test," Journal of the Royal Statistical Society, Ser. B, 37, 394-401 (1975).

[4] Siegmund, D., "Large Deviation Probabilities in the Strong Law of Large Numbers,"
Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 31, 107-113 (1975).

[5] Siegmund, D., "Estimation Following Sequential Tests," Biometrika, 65, 341-349 (1978).

[6] Woodroofe, M., "A Renewal Theorem for Curved Boundaries and Moments of First Passage
Times," Annals of Probability, 4, 67-80 (1976).




INTERVAL  ESTIMATION  OF  A GLOBAL  OPTIMUM 
FOR  LARGE  COMBINATORIAL  PROBLEMS 


Bruce  L.  Golden  and  Frank  B.  Alt 

College  of  Business  and  Management 
University  of  Maryland 
College  Park,  Maryland 


ABSTRACT 

Consider an "intractable" optimization problem for which no efficient solution
technique exists. Given a systematic procedure for generating independent
heuristic solutions, we seek to obtain interval estimates for the globally optimal
solution using statistical inference. In previous work, accurate point estimates
have been derived. Determining interval estimates, however, is a considerably
more difficult task. In this paper, we develop straightforward procedures which
compute confidence intervals efficiently in order to evaluate heuristic solutions
and assess deviations from optimality. The strategy presented is applicable to a
host of combinatorial optimization problems. The assumptions of our model,
along with computational experience, are discussed.

INTRODUCTION 

Many  problems  in  the  area  of  combinatorial  optimization  are  known  to  be  NP-complete. 
Any  current  algorithm  for  obtaining  the  exact  solution  to  a problem  in  this  class  is  an  exponen- 
tial time  algorithm.  That  is,  in  the  worst  case,  running  time  increases  exponentially  as  a func- 
tion of  an  input  parameter,  such  as  the  number  of  nodes.  Moreover,  evidence  suggests  that 
polynomial  time  algorithms  do  not  exist  for  any  NP-complete  problem  (see  Karp  [12]  for  more 
details). 

With  this  in  mind,  simple  and  efficient  heuristic  techniques  which  yield  near-optimal  solu- 
tions consistently  become  essential.  At  the  same  time,  tools  need  to  be  developed  and  shar- 
pened in  order  to  measure  the  accuracy  of  these  heuristics.  Much  recent  research  has  been 
directed  at  generating  performance  guarantees  for  heuristic  algorithms  which  specify  a priori 
upper  bounds  (a  priori  in  the  sense  that  the  bounds  are  independent  of  the  data)  on  percentage 
deviations from optimality. Garey and Johnson [5] and Sahni [23] survey combinatorial
approximation  algorithms  which  are  guaranteed  to  find  solutions  that  are  "close"  to  optimal.  In 
earlier  work,  Held  and  Karp  [10,11]  had  shown  how  to  derive  sharp  data-dependent  bounds 
using  duality  theory  and  Lagrangean  relaxation. 

In  this  paper,  we  approach  the  problem  of  analyzing  heuristic  solutions  from  a different 
perspective.  Although  we  focus  on  the  infamous  traveling  salesman  problem  (TSP),  the 
approach  applies  to  other  NP-complete  problems  as  well.  The  methods  we  discuss  attempt  to 
bring  statistical  inference  to  bear  in  order  to  provide  data-dependent  tight  lower  (upper)  bounds 
for  combinatorial  minimization  (maximization)  problems.  These  bounds  have  probabilities 
associated  with  them  but  are  easier  to  calculate  than  bounds  obtained  via  Lagrangean  relaxation. 




B.  L.  GOLDEN  AND  F.  B ALT 




In particular, given a systematic procedure for generating independent heuristic solutions,
we seek to obtain interval estimates for the globally optimal solution using statistical inference.
In recent work by Golden [6,7], Dannenbring [3], and Klein [13] accurate point estimates have
been derived. Determining interval estimates, however, is a considerably more difficult task.
The key objective of this paper is to develop straightforward procedures which compute
confidence intervals efficiently in order to evaluate heuristic solutions and assess deviations
from optimality.

Define a network $G = [N, A, C]$ with $N$ the set of nodes, $A$ the set of arcs, and $C = [c_{ij}]$
the matrix of costs. The TSP requires that we find the Hamiltonian cycle of $G$ of minimal total
cost. The interval estimation procedure developed will be applied to the TSP and will use the
2-opt heuristic algorithm discussed by Lin [15] for generating solutions.

Given  a randomly  constructed  starting  tour,  the  2-opt  algorithm  checks  each  pair  of  arcs 
to  see  if  it  can  be  replaced  by  another  pair  of  arcs  not  yet  in  the  tour  so  as  to  produce  a new 
tour  of  reduced  cost.  The  procedure  continues  until  no  further  replacements  can  be  performed. 
The random starting tours are easily generated with a computer program from Nijenhuis and
Wilf [20] and the entire process can be repeated any number of times.
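The 2-opt procedure with random restarts can be sketched as follows. This is a minimal illustration rather than the code used in the experiments: it assumes a symmetric cost matrix, uses Python's `random.shuffle` in place of the Nijenhuis and Wilf routine, and the function names and the small 12-node test instance are hypothetical.

```python
import math
import random

def tour_length(tour, cost):
    """Total cost of the closed tour visiting nodes in the given order."""
    n = len(tour)
    return sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))

def two_opt(tour, cost):
    """Apply improving 2-exchanges until none remains (a 2-opt local minimum)."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two arcs share a node
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                # Replace arcs (a,b) and (c,d) by (a,c) and (b,d).
                delta = cost[a][c] + cost[b][d] - cost[a][b] - cost[c][d]
                if delta < -1e-12:
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

def heuristic_sample(cost, S, seed=0):
    """Return S 2-opt tour lengths x_1, ..., x_S from random starting tours."""
    rng = random.Random(seed)
    lengths = []
    for _ in range(S):
        start = list(range(len(cost)))
        rng.shuffle(start)
        lengths.append(tour_length(two_opt(start, cost), cost))
    return lengths

# Hypothetical 12-node Euclidean instance.
rng = random.Random(42)
pts = [(rng.random(), rng.random()) for _ in range(12)]
cost = [[math.dist(p, q) for q in pts] for p in pts]
x = heuristic_sample(cost, S=10)
v = min(x)  # best heuristic value found
```

Each accepted exchange strictly shortens the tour, so the loop terminates; the values in `x` are the kind of heuristic solutions whose distribution is examined in the next section.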

STATISTICAL  BACKGROUND 

Suppose we take $S$ independent samples, each of size $m$, from a parent population which
is bounded from below by $a$. If $x_i$ is the smallest value in sample $i$, then let

(1)  $v = \min\{x_i : 1 \le i \le S\}$.

Fisher and Tippett [4] proved that as $m$ gets large the distribution of $x_i$ approaches a Weibull
distribution with $a$ as the location parameter (see Gumbel [8] for more details). The
cumulative Weibull distribution is given by

(2)  $F_{x_i}(x_0) = \text{Prob}\{x_i \le x_0\} = 1 - \exp\{-[(x_0 - a)/b]^{c}\}$

for $x_0 \ge a > 0$, $b > 0$, $c > 0$,

where $a$ is the location parameter, $b$ is the scale parameter, and $c$ is the shape parameter.

McRoberts [18] and Golden [6,7] appeal to this fundamental result in statistical
extreme-value distribution theory in order to derive point estimates for the global optimum.
Specifically, in [7] we argue that if $G$ has $n$ nodes, then the parent population consists of
$(n-1)!/2$ tours with total costs bounded from below by $a$, the length of the optimal tour. Each
heuristic solution $x_i$ ($i = 1, 2, \ldots, S$) is a local minimum relative to the "swapping" operation
from a large number $m$ of possible tours. Furthermore, these heuristic solutions are
independent since the initial tours are randomly chosen.

Good quick estimates for the three parameters can be determined from a
least-squares/goodness-of-fit analysis, as discussed by Golden [7]. One can also solve the
maximum-likelihood equations using one of the methods recommended by Zanakis [26]. We
remark briefly that Dannenbring [3] and Klein [13] have also approached the point estimation
problem but from different viewpoints. Dannenbring studies an estimate derived by Robson
and Whitlock [22] which applies to any distribution with a left truncation point. Klein's




approach  involves  using  a three-parameter  power-function  distribution  (a  special  case  of  the 
beta  distribution). 


In order to verify the argument that the heuristic solutions are Weibull distributed, we
computed Kolmogorov-Smirnov statistics (assuming independence) in our point estimation
experiments [7]. In all cases, these statistics fell far below the critical value at the 0.05 level of
significance, indicating that there is no reason to reject the Weibull hypothesis. In addition, we
observed that the point estimate was always within 2.8% of the true optimal solution.

Next, we sought to confirm the intuitive reasoning that heuristic solutions are
independent. One set of fifty 2-opt solutions to a 100-node TSP was randomly selected, and the theory
of runs was applied to this data. We describe two statistical tests. Additional tests were
performed; all had the same outcome.

The first test of randomness is based on the total number of runs. This test dichotomizes
the sequence of 50 observations by classifying each observation as either above or below the
sample median. The number of runs above and below the median is listed in Table 1.


TABLE 1. Test of Randomness

Run Length    Number of Runs        Number of Runs
              Above the Median      Below the Median
    1                9                     5
    2                0                     5
    3                3                     2
    4                0                     1
    5                0                     0
    6                0                     0
    7                1                     0
  Total             13                    13

Thus,  the  total  number  of  runs  is  26.  Swed  and  Eisenhart’s  tables  [25]  show  that  the  limiting 
values  for  the  total  number  of  runs  above  and  below  the  median  are  16  at  the  0.005  level  of 
significance  and  19  at  the  0.05  level  of  significance.  Hence,  there  is  no  indication  that  the  total 
number  of  runs  above  and  below  the  median  is  smaller  than  would  occur  in  a random 
sequence. 

The  next  statistical  test  of  randomness  is  based  on  the  length  of  the  longest  run  on  either 
side  of  the  median.  Inspection  reveals  that  the  length  of  the  longest  run  is  7.  Now,  Mosteller’s 
table  [19]  gives  the  limiting  values  at  the  0.05  and  0.01  significance  levels  as  10  and  11,  respec- 
tively. Therefore,  the  length  of  the  longest  run  does  not  give  any  reason  to  reject  the 
hypothesis  of  randomness. 
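Both run statistics used above are simple to compute from the sequence of heuristic values. The sketch below is a minimal illustration with hypothetical function names; it assumes that observations equal to the median are dropped, a convention the text does not specify. Critical values would still have to come from the tables of Swed and Eisenhart [25] and Mosteller [19].

```python
from itertools import groupby
from statistics import median

def runs_about_median(xs):
    """Return a list of (is_above_median, run_length) in sequence order."""
    med = median(xs)
    # Assumption: values tied with the median are discarded.
    signs = [x > med for x in xs if x != med]
    return [(above, len(list(group))) for above, group in groupby(signs)]

def run_statistics(xs):
    """Return (total number of runs, length of the longest run)."""
    runs = runs_about_median(xs)
    return len(runs), max(length for _, length in runs)

# Small made-up sequence: the median is 4.5.
data = [5, 1, 6, 7, 2, 3, 8, 4]
total_runs, longest_run = run_statistics(data)
print(total_runs, longest_run)  # 6 runs, longest of length 2
```

Too few runs, or one very long run, would suggest dependence between successive solutions; the text's sample of 50 gives 26 runs with a longest run of 7.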

In summary, the run tests of randomness demonstrate that the 2-opt solutions are
independent. The Kolmogorov-Smirnov test convinces us that the 2-opt solutions are distributed
according to a Weibull distribution. The effect of these statistical tests is to give increased
credibility to the validity of point and interval estimation in this setting.




INTERVAL  ESTIMATION  OF  A GLOBAL  OPTIMUM 


After  point  estimation,  the  next  step  is  to  develop  a procedure  for  determining  a 
confidence  interval  for  a,  the  optimal  TSP  solution.  This  task  is  considerably  more  difficult 
than  obtaining  a point  estimate  but  potentially  far  more  rewarding.  If  we  could  say,  for  exam- 
ple, that  with  probability  0.99  the  optimal  solution  is  contained  in  the  interval  [975,1000]  for  a 
particular  problem,  then  we  would  have  an  invaluable  tool  for  evaluating  heuristic  solutions  and 
measuring  deviations  from  optimality.  In  this  section,  we  discuss  several  candidate  interval- 
estimation  techniques.  Only  one  of  these  performs  successfully.  We  present  explanations  for 
the  failure  or  success  of  each  technique.  In  addition,  various  computational  experiments  are 
discussed. 


Clough [2], using Monte Carlo methods, applied extreme-value theory to the problem of
estimating the global optimum of a function of several variables subject to a set of constraints;
he developed confidence statements as well. Unfortunately, his results pertain only to the very
restricted instance of the Weibull distribution where $c = 1$ (an exponential distribution). If we
can fit the heuristic solutions to an exponential distribution then, upon manipulating equation
(37) in Clough [2], it can be shown that


(3) 


Prob 


S + h 

1 + h 

1 

[s  - 1 

v — 

S - 1 

s 

s 

Z */ 

/-I 


< a < v\ 


- 1 - 


S - 1 


S.+  h 


s- 1 


't 


V 


for  arbitrary  values  of  h. 

Not surprisingly, this confidence interval almost always fails to contain the optimal
solution. This follows since, as we will see later, the 2-opt solutions tend to have Weibull
distributions with shape parameters of approximately 2. Setting $c = 1$ beforehand invalidates the
Fisher-Tippett result on which this model relies.

Assuming that the distribution of $x_i$ is, approximately, a three-parameter Weibull
distribution, Mann et al. in section 5.2.3(d) of their excellent book [17] demonstrate an interval
estimation procedure which depends on a beta distribution approximation. In experiments on eight
TSP's, only five of the eight resulting 90% confidence intervals included the optimum. A
possible explanation for these lackluster statistics is the fact that two approximations (the Weibull
approximation and then the beta approximation) are needed in order to construct confidence
intervals. The approximation errors are very likely cumulative.

Let $x_{(i)}$ be the $i$th order statistic (then $v = x_{(1)}$). Robson and Whitlock [22] introduced
an exact $100(1 - \alpha)\%$ confidence statement on a left truncation point for a uniform distribution
and argued that this gives an approximate confidence interval for other density functions as
well. This approximate confidence interval is given by

(4)  $\text{Prob}\{v - [(1 - \alpha)/\alpha]\,(x_{(2)} - v) \le a \le v\} = 1 - \alpha$.

If we set $\alpha$ equal to 0.5, the lower confidence limit becomes the point estimate tested by
Dannenbring.

We generated fifty 2-opt solutions to each of the five 100-node benchmark problems first
presented by Krolak et al. [14]. In addition, the 50 solutions found by Padberg and Hong [21]
to the 318-node problem of Lin and Kernighan [16] using a variant of the 2-opt are also
analyzed. Table 2 reveals that this procedure also has shortcomings. Furthermore, note that if
the confidence coefficient is increased from 0.95 to 0.99 then the confidence interval [15961.25,
22382.39] for problem 28 becomes uninformative.


TABLE 2. Computational Results: Interval Estimation
Using the Robson-Whitlock Approach

Problem           v          x_(2)      95% Confidence          Optimal    Does Interval
Identification                          Interval                Solution   Include Optimum?
24 [14]           21518.99   21545.79   [21009.79, 21518.99]    21282.     Yes
25 [14]           22816.55   22881.52   [21582.12, 22816.55]    22148.     Yes
26 [14]           20971.54   20975.48   [20896.68, 20971.54]    20749.     No
27 [14]           21807.23   21827.43   [21423.43, 21807.23]    21294.     No
28 [14]           22382.39   22447.25   [21150.05, 22382.39]    22068.     Yes
318 nodes [16]    41415.00   41460.00   [40560.00, 41415.00]    41349.     Yes
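The Robson-Whitlock limits need only the two smallest heuristic values. A minimal sketch follows, assuming the interval has the form $[v - ((1 - \alpha)/\alpha)(x_{(2)} - v),\ v]$, which is the form that reproduces the Table 2 entries; the function name is hypothetical.

```python
def robson_whitlock_interval(v, x2, alpha):
    """Approximate 100(1 - alpha)% interval for the lower truncation point a."""
    lower = v - (1.0 - alpha) / alpha * (x2 - v)
    return lower, v

# Problem 24 of Table 2 at the 95% level (alpha = 0.05).
lower, upper = robson_whitlock_interval(21518.99, 21545.79, 0.05)
print(round(lower, 2), upper)

# With alpha = 0.5 the lower limit reduces to the point estimate 2v - x_(2).
point_estimate, _ = robson_whitlock_interval(21518.99, 21545.79, 0.5)
```

Because the multiplier $(1 - \alpha)/\alpha$ grows from 19 at the 95% level to 99 at the 99% level, the interval widens very quickly as the confidence coefficient is raised.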

This  procedure  has  a number  of  deficiencies.  Primarily,  the  confidence  statement  is 
distribution-free  and  thus  fails  to  exploit  the  known  characteristics  of  the  Weibull  distribution. 
In  particular,  since  the  Weibull  distribution  is  noticeably  nonuniform,  the  approximate 
confidence  interval  is  probably  not  appropriate  in  this  application. 




Now we present an interval estimation procedure which seems to perform quite
satisfactorily. From equation (2) we see that

(5)  $F_{x_i}(a + b) = 1 - e^{-1} \approx 0.63$,

which enables us to write

$\text{Prob}\{v \le a + b\} = 1 - \text{Prob}\{v > a + b\}$

$= 1 - [1 - F_{x_1}(a + b)]\,[1 - F_{x_2}(a + b)] \cdots [1 - F_{x_S}(a + b)]$

$= 1 - e^{-S}$, or

(6)  $\text{Prob}\{v - b \le a \le v\} = 1 - e^{-S}$.

Thus, a $100(1 - e^{-S})\%$ confidence interval for $a$ is $[v - b, v]$ when $b$ is fixed. Since $b$ is
seldom known, we now present a procedure for constructing an approximate confidence interval
for the location parameter $a$.
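The two facts used in this derivation can be checked numerically. The sketch below assumes the standard three-parameter Weibull form for (2), $F(x_0) = 1 - \exp\{-[(x_0 - a)/b]^c\}$; the function names and the parameter values in the example are hypothetical.

```python
import math

def weibull_cdf(x0, a, b, c):
    """Three-parameter Weibull CDF: location a, scale b, shape c."""
    if x0 < a:
        return 0.0
    return 1.0 - math.exp(-((x0 - a) / b) ** c)

def coverage(S):
    """Confidence coefficient 1 - exp(-S) of the interval [v - b, v] in (6)."""
    return -math.expm1(-S)

# (5): at x0 = a + b the CDF equals 1 - 1/e for every shape parameter c.
for c in (1.0, 2.0, 3.5):
    print(round(weibull_cdf(21000.0 + 900.0, 21000.0, 900.0, c), 4))

# (6): the chance that all S independent values exceed a + b is exp(-S).
print(math.exp(-50), coverage(5))
```

Even a handful of independent heuristic solutions gives a coefficient close to one (about 0.993 for S = 5), which is why the interval's usefulness depends mainly on how well $b$ is estimated.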


STEP 1: Compute $x_1, \ldots, x_S$.

STEP 2: Rearrange the observations from smallest to largest to obtain $v = x_{(1)}, x_{(2)}, \ldots, x_{(S)}$,
and find the median $x_M$.

STEP 3: Determine good initial parameter estimates from the expressions

(7)  $\hat{a} = 2v - x_{(2)}$,

(8)  $\hat{b} = x_{([0.63S] + 1)} - \hat{a}$,

where $[z]$ is the largest integer less than $z$, and

(9)  $\hat{c} = \ln[-\ln(0.5)] / [\ln(x_M - \hat{a}) - \ln \hat{b}]$.

STEP 4: Solve the maximum-likelihood equations for "improved" parameter estimates
$\hat{a}$, $\hat{b}$, $\hat{c}$.

STEP 5: An approximate confidence interval is given by $[v - \hat{b}, v]$.
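Steps 1 to 3 and Step 5 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the initial location estimate is taken to be $2v - x_{(2)}$ (the lower limit of (4) at $\alpha = 0.5$), the function names are hypothetical, and Step 4 (the maximum-likelihood refinement) is not implemented, so the scale estimate passed to Step 5 must come from a separate solver.

```python
import math
from statistics import median

def initial_estimates(xs):
    """Step 3: initial estimates (a, b, c) from the heuristic values xs."""
    xs = sorted(xs)                        # Step 2: order statistics
    S = len(xs)
    v, x2, x_med = xs[0], xs[1], median(xs)
    a0 = 2.0 * v - x2                      # (7), assumed form
    k = math.ceil(0.63 * S) - 1            # [0.63 S]: largest integer less than 0.63 S
    b0 = xs[k] - a0                        # (8): x_([0.63S]+1) is xs[k] with 0-based indexing
    c0 = math.log(-math.log(0.5)) / (math.log(x_med - a0) - math.log(b0))  # (9)
    return a0, b0, c0

def confidence_interval(v, b_hat):
    """Step 5: approximate interval [v - b_hat, v] for the optimum a."""
    return v - b_hat, v

# Made-up sample of S = 50 values, only to exercise the formulas.
sample = [10.0 + i for i in range(50)]
a0, b0, c0 = initial_estimates(sample)

# First row of Table 3: v = 21518.99 with maximum-likelihood scale 926.36.
lower, upper = confidence_interval(21518.99, 926.36)
print(a0, b0, round(c0, 3), round(lower, 2), upper)
```

The index in (8) relies on floating-point evaluation of 0.63 S, so for sample sizes where 0.63 S is an exact integer the rounding should be checked by hand.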

Some comments regarding this algorithm, especially Step 3, are in order. Equations (7)
and (8) follow from equations (4) and (5), respectively. The Weibull cumulative-distribution
function yields (upon taking logarithms twice) the following equation:

(10)  \ln\{-\ln[1 - F_X(x_0)]\} = c \ln(x_0 - a) - c \ln b.

If we replace a, b, and c with their estimates and let x_0 = x_M, equation (9) results. Zanakis
[26], in his comparison of procedures for deriving maximum-likelihood estimates for the
three-parameter Weibull distribution, suggests that good initial parameter estimates may reduce
the computation time and increase the accuracy of an algorithm; in addition, he finds the
Harter-Moore gradient-search technique [9] to be the most accurate code for performing Step 4,
in general. We use this code in our computational experiments. The data analyzed in Table 2 are
analyzed again in Table 3.
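
As an illustration of Steps 2, 3, and 5, the sketch below computes the estimates of equations (8) and (9) and the resulting interval. It is only a sketch under stated assumptions: the location estimate a_hat of equation (7) is supplied by the caller, the maximum-likelihood refinement of Step 4 is skipped, and the function name and the sample data are hypothetical.

```python
import math
import statistics

def interval_from_initial_estimates(xs, a_hat):
    """Steps 2, 3 (equations (8) and (9)), and 5 for a given location estimate a_hat."""
    xs = sorted(xs)                  # x_(1) <= x_(2) <= ... <= x_(S)
    S = len(xs)
    v = xs[0]                        # best heuristic solution, v = x_(1)
    x_med = statistics.median(xs)    # x_M (middle two values are averaged when S is even)
    # [z] is read, as in the text, as the largest integer less than z.
    k = math.ceil(0.63 * S) - 1
    b_hat = xs[k] - a_hat            # equation (8); xs[k] is x_([0.63S]+1) in 1-based terms
    # Equation (9); undefined if x_med <= a_hat or x_med - a_hat == b_hat.
    c_hat = math.log(-math.log(0.5)) / (math.log(x_med - a_hat) - math.log(b_hat))
    return v, b_hat, c_hat, (v - b_hat, v)

sample = [12, 11, 15, 13, 19, 14, 17, 16, 18, 20]   # hypothetical heuristic tour lengths
print(interval_from_initial_estimates(sample, a_hat=10))
```

In the procedure of the text, the estimates would next be passed to a maximum-likelihood routine (Step 4) before the interval of Step 5 is formed.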


TABLE 3. Computational Results: 100(1 - e^{-S})% Confidence Intervals*

Problem                                                                          Optimal    Does Interval
Identification   v          \hat{a}    \hat{b}   \hat{c}   [v - \hat{b}, v]       Solution   Include Optimum?

24 [14]          21518.99   21454.77    926.36   1.76      [20592.63, 21518.99]   21282.     Yes
25 [14]          22816.55   22754.95    856.92   1.92      [21959.63, 22816.55]   22148.     Yes
26 [14]          20971.54   20763.87   1260.75   2.31      [19710.79, 20971.54]   20749.     Yes
27 [14]          21807.23   21779.19    837.82   1.51      [20969.41, 21807.23]   21294.     Yes
28 [14]          22382.39   22190.20   1415.64   2.62      [20966.75, 22382.39]   22068.     Yes
318 nodes [16]   41415.00   41365.38    609.58   1.96      [40805.42, 41415.00]   41349.     Yes

*For these problems, S = 50 and e^{-S} = 1.92874 \times 10^{-22}. This implies that 1 - e^{-S} is practically one.


Notice that the width of the confidence interval (\hat{b}) is never more than about 6.5% of the
optimal solution and that the optimal solution is contained in the interval in every case. We
expect this to be true almost always since 1 - e^{-S} approaches 1 very quickly as S increases. It
is not surprising that this technique works well when one considers that the Weibull assumption
is taken advantage of, only one estimated parameter enters into the confidence statement
explicitly, and the confidence coefficient is extremely close to unity. The simplicity of the
statement is an additional advantage.
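
The two observations above can be reproduced directly from the entries of Table 3. The snippet below is a small arithmetic check; the dictionary and function names are illustrative and not from the original study.

```python
# (v, b_hat, optimal solution) as listed in Table 3
TABLE_3 = {
    "24": (21518.99, 926.36, 21282.0),
    "25": (22816.55, 856.92, 22148.0),
    "26": (20971.54, 1260.75, 20749.0),
    "27": (21807.23, 837.82, 21294.0),
    "28": (22382.39, 1415.64, 22068.0),
    "318 nodes": (41415.00, 609.58, 41349.0),
}

def check_row(v, b_hat, optimum):
    """Return (lower end of [v - b_hat, v], coverage flag, width relative to the optimum)."""
    lower = v - b_hat
    return lower, lower <= optimum <= v, b_hat / optimum

results = {name: check_row(*row) for name, row in TABLE_3.items()}
for name, (lower, covered, ratio) in results.items():
    print(name, round(lower, 2), covered, round(100 * ratio, 2))
```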

Since there are so few TSP's of over 100 nodes where the optimal solution is known, we
take advantage of the work of Beardwood et al. [1], who derived an asymptotic expected-length
formula for an optimal traveling salesman tour when the nodes are distributed randomly and
uniformly over a rectangular area of R units. The expected length of the optimal TSP tour,
L(n,R), is given by


INTERVAL  ESTIMATION  OF  A GLOBAL  OPTIMUM 




(11)  L(n,R) = K \sqrt{n} \sqrt{R}.

In extensive computational experiments, Stein [24] obtains the following empirical bounds on
the constant K:

(12)  0.765 < K \le 0.765 + \frac{(\text{numerator illegible in the source})}{n}.

In Table 1 of Golden [6], networks of from 70 to 130 nodes were generated uniformly in a
square of area 10,000, heuristic solutions to the TSP were calculated, and Weibull point
estimates were computed. Confidence intervals for the location parameters in these problems are
developed in Table 4 where 0.765 \sqrt{n} \sqrt{R} is taken as the presumed optimal solution.
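
The presumed optimal solutions used in Table 4 follow from equation (11) with K = 0.765 and R = 10,000. A short sketch of the arithmetic (the function name is illustrative):

```python
import math

def presumed_optimum(n, area, K=0.765):
    """Equation (11), L(n, R) = K * sqrt(n) * sqrt(R), with K set to 0.765."""
    return K * math.sqrt(n) * math.sqrt(area)

# Problem sizes of Table 4: 70, 80, ..., 130 nodes in a square of area 10,000.
for n in range(70, 131, 10):
    print(n, round(presumed_optimum(n, 10_000), 2))
```

With R = 10,000 the formula reduces to 76.5 \sqrt{n}, which is the column heading used in Table 4.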


TABLE 4. Computational Results: 100(1 - e^{-S})% Confidence Intervals*

                                                Optimal
Problem                                         Solution            Does Interval
Identification   v        [v - \hat{b}, v]      (76.5 \sqrt{n})     Include Optimum?

70 nodes [6]     659.39   [621.61, 659.39]      640.04              Yes
80 nodes [6]     700.46   [662.69, 700.46]      684.24              Yes
90 nodes [6]     747.24   [711.50, 747.24]      725.74              Yes
100 nodes [6]    785.73   [668.31, 785.73]      765.00              Yes
110 nodes [6]    832.44   [720.79, 832.44]      802.34              Yes
120 nodes [6]    915.36   [833.06, 915.36]      838.02              Yes
130 nodes [6]    917.56   [825.84, 917.56]      872.23              Yes

*For the first four problems, S = 25 and e^{-S} = 1.38879 \times 10^{-11}. For the last three problems,
S = 30 and e^{-S} = 9.35762 \times 10^{-14}. Again, 1 - e^{-S} is practically one.


Again,  in  all  cases  the  interval  includes  the  optimum.  Also,  the  intervals  are  fairly  tight 
but  not  as  tight  as  the  Table  3 intervals.  This  is  partly  due  to  the  fact  that  the  sample  size  in 
Table  4 is  smaller  than  it  is  in  Table  3 (see  Ref.  [6]  for  details). 

In addition to showing the analysis of a heuristic solution v to a particular combinatorial
problem, Tables 3 and 4 provide a basis for comparing the accuracy of various heuristic
procedures. For example, the heuristic solutions x_i to the 318-node problem were obtained by the
Lin-Kernighan algorithm, which is a substantial improvement on the 2-opt heuristic. From
Table 3, we can see that the performance measure b/v (a measure of relative interval width) is
generally smaller when the more powerful heuristic is used. The heuristic solutions x_i studied
in Table 4 were obtained via the Clarke-Wright algorithm, which is not as powerful as the 2-opt
procedure [7]. Again the performance measure b/v reflects this fact. That is, in a relative
sense the intervals are tighter when using the 2-opt procedure. Thus, the more powerful the
heuristic, the tighter are the confidence intervals that can be expected.

CONCLUSIONS 

In this paper, we have introduced a statistical approach for attacking intractable
mathematical optimization problems. Straightforward and efficient procedures have been developed
to perform interval estimation in the case where the globally optimal solution is unknown. The
interval-estimation procedure has been quite successful. The 100(1 - e^{-S})% confidence
intervals derived are of the simple form [v - \hat{b}, v].

Much work remains to be done. Interval-estimation construction on additional large
traveling salesman test problems should be pursued, and, of course, we would like to see this
technique applied to other combinatorial problems.

ACKNOWLEDGMENTS 

We  wish  to  thank  Prof.  H.  Leon  Harter  for  making  his  computer  program  available  to  us 
in  this  study.  In  addition,  we  thank  Prof.  Jim  Yee  for  helpful  comments.  The  research  of  the 
first  author  was  supported  in  part  by  a University  of  Maryland  General  Research  Board  Faculty 
Research  Award. 


REFERENCES 

[1]  Beardwood, J., J. Halton, and J. Hammersley, "The Shortest Path Through Many Points,"
     Proceedings of the Cambridge Philosophical Society 55, 299-327 (1959).

[2]  Clough, D., "An Asymptotic Extreme Value Sampling Theory for the Estimation of a
     Global Maximum," Canadian Operational Research Society Journal 7, 102-115 (1969).

[3]  Dannenbring, D., "Estimating Optimal Solutions for Large Combinatorial Problems,"
     Management Science 23, 1273-1283 (1977).

[4]  Fisher, R., and L. Tippett, "Limiting Forms of the Frequency Distribution of the Largest
     or Smallest Member of a Sample," Proceedings of the Cambridge Philosophical Society
     24, 180-190 (1928).

[5]  Garey, M., and D. Johnson, "Approximation Algorithms for Combinatorial Problems: An
     Annotated Bibliography," in Algorithms and Complexity: New Directions and Recent
     Results, J. Traub, ed. (Academic Press, New York, 1976), pp. 41-52.

[6]  Golden, B., "A Statistical Approach to the TSP," Networks 7, 209-225 (1977).

[7]  Golden, B., "Point Estimation of a Global Optimum for Large Combinatorial Problems,"
     Communications in Statistics B7, 361-367 (1978).

[8]  Gumbel, E., Statistics of Extremes (Columbia University Press, New York, 1958).

[9]  Harter, H., and A. Moore, "Maximum Likelihood Estimation of the Parameters of the
     Gamma and Weibull Populations from Complete and from Censored Samples,"
     Technometrics 7, 639-643 (1965).

[10] Held, M., and R. Karp, "The Traveling Salesman Problem and Minimum Spanning
     Trees," Operations Research 18, 1138-1162 (1970).

[11] Held, M., and R. Karp, "The Traveling Salesman Problem and Minimum Spanning Trees:
     Part II," Mathematical Programming 1, 6-25 (1971).

[12] Karp, R., "On the Computational Complexity of Combinatorial Problems," Networks 5,
     45-68 (1975).

[13] Klein, S., "Monte Carlo Estimation in Complex Optimization Problems," Doctoral
     Dissertation, The George Washington University, Washington, D.C. (1975).

[14] Krolak, P., W. Felts, and G. Marble, "A Man-Machine Approach Toward Solving the
     Traveling Salesman Problem," Communications of the Association for Computing
     Machinery (ACM) 14, 327-334 (1971).

[15] Lin, S., "Computer Solutions of the TSP," Bell System Technical Journal 44, 2245-2269
     (1965).

[16] Lin, S., and B. Kernighan, "An Effective Heuristic Algorithm for the Traveling Salesman
     Problem," Operations Research 21, 498-516 (1973).

[17] Mann, N., R. Schafer, and N. Singpurwalla, Methods for Statistical Analysis of Reliability
     and Life Data (John Wiley and Sons, New York, 1974).

[18] McRoberts, K., "A Search Model for Evaluating Combinatorially Explosive Problems,"
     Operations Research 19, 1331-1349 (1971).

[19] Mosteller, F., "Note on Application of Runs to Quality Control Charts," Annals of
     Mathematical Statistics 12, 228-232 (1941).

[20] Nijenhuis, A., and H. Wilf, Combinatorial Algorithms (Academic Press, New York, 1975).

[21] Padberg, M., and S. Hong, "On the Symmetric Traveling Salesman Problem: A
     Computational Study," Working Paper No. 77-89, Graduate School of Business Administration,
     New York University, New York, N.Y. (1977).

[22] Robson, D., and J. Whitlock, "Estimation of a Truncation Point," Biometrika 51, 33-39
     (1964).

[23] Sahni, S., "General Techniques for Combinatorial Approximation," Operations Research
     25, 920-936 (1977).

[24] Stein, D., "Scheduling Dial-A-Ride Transportation Systems: An Asymptotic Approach,"
     Doctoral Dissertation, Harvard University, Cambridge, Mass. (1977).

[25] Swed, F., and C. Eisenhart, "Tables for Testing Randomness of Grouping in a Sequence
     of Alternatives," Annals of Mathematical Statistics 14, 66-87 (1943).

[26] Zanakis, S., "Computational Experience with Some Nonlinear Optimization Algorithms in
     Deriving Maximum Likelihood Estimates for the Three-Parameter Weibull
     Distribution," TIMS Studies in the Management Sciences 7, 63-77 (1977).


LEAST-ABSOLUTE- VALUE  ESTIMATORS 
FOR  ONE-WAY  AND  TWO-WAY  TABLES 


R.  D.  Armstrong  and  E.  L.  Frome 

University  of  Texas  at  Austin 
Austin,  Texas 

ABSTRACT 

This  paper  concerns  itself  with  the  problem  of  estimating  the  parameters  of 
one-way  and  two-way  classification  models  by  minimization  of  the  sum  of  the 
absolute  deviations  of  the  regression  function  from  the  observed  points.  The 
one-way  model  reduces  to  obtaining  a set  of  medians  from  which  optimal 
parameters  can  be  obtained  by  simple  arithmetic  manipulations.  The  two-way 
model  is  transformed  into  a specially  structured  linear  programming  problem, 
and two algorithms are presented to solve this problem. The occurrence of
alternative optimal solutions in both models is discussed, and numerical
examples are presented.


INTRODUCTION 

An important problem in statistics is the study of the effect of one or two factors on a
dependent variable. This problem can be formulated as a regression analysis using dummy
(0,1) variables to represent the levels of the factors, and the resulting least squares analysis
(LSQ) is well known [29]. Recently, the least-squares approach has come under considerable
criticism, and several resistant estimation procedures have been proposed [1,19-22]. Along
with the resistant estimation techniques has come an increased computational burden, and in
some cases subjective decisions concerning outliers [14,27,31], weight functions [6], and score
functions [21] must be made by the statistician. Minimizing the sum of the absolute values of
the residuals is a robust procedure [19] which in some cases bypasses the latter difficulty. The
computations involved in obtaining least-absolute-value (LAV) estimates have been a major
deterrent to their use. This paper will demonstrate how LAV estimates can efficiently be obtained
for one-way and two-way tables. A second difficulty in LAV estimation is the existence of
alternative optimal solutions. For one-way and two-way tables alternative optimal solutions will
frequently exist. However, a unique solution can be obtained by placing appropriate restrictions
on the parameters. From a data-analytic point of view this may be regarded as an advantage
since it requires some careful thought in the selection of additional constraints that will yield a
"good" solution. In some situations this simple row-plus-column fit provides a first step in data
analysis. The fitted values and residuals are used to identify possible outliers or to suggest how
an improved fit may be obtained (see, e.g., [1], [6]). The LAV estimates will also provide a
good starting point for resistant procedures that are iterative and require residuals from an
initial fit to initiate the procedure.




Charnes,  Cooper,  and  Ferguson  [12]  appear  to  be  the  first  to  present  a practical  approach 
to  the  solution  of  the  general  linear  LAV  problem.  They  demonstrated  how  the  problem  could 
be  transformed  into  a linear  programming  problem  and  thus  be  solved  by  use  of  the  well- 
developed  theory  of  linear  programming  (LP).  They  also  proved  the  statistical  consistency  of 
the  estimates  for  LAV  or  any  other  norm-functional.  Other  papers  primarily  concerned  with 
the  use  of  LP  to  solve  LAV  problems  are  Refs.  [2,4,5,28,30].  The  main  point  to  be  gleaned 
from  the  more  recent  of  these  references  is  that  a special-purpose  primal  simplex  algorithm  has 
proven  to  be  the  most  efficient  method  for  the  solution  of  linear  LAV  problems.  A reasonable 
alternative to the special-purpose primal algorithm is to take the dual of the original LP
problem and solve it via an LP code with simple upper bounding. Section 3 outlines this
transformation for a two-factor model while Wagner [32] gives a more detailed presentation for the
general  case.  Computational  results  [3]  indicate  that  the  dual  approach  takes  approximately 
four  times  as  long  as  the  special-purpose  primal  algorithm,  but  the  algorithm  for  solving  the 
dual  has  the  definite  advantage  of  being  more  widely  available. 

In  Section  2,  we  demonstrate  how  LAV  estimates  can  be  obtained  for  a one-factor  model 
without  LP.  In  Section  3,  two  computer-oriented  approaches  for  the  analysis  of  a two-factor 
model  using  LP  are  presented.  Both  methods  exploit  the  topological  structure  of  LP  problems 
to  provide  efficient  solution  techniques.  Section  4 presents  sufficient  conditions  for  alternative 
optimal  solutions  to  exist  when  additional  criteria  are  not  present.  Examples  illustrating  LAV 
estimation  for  one-way  and  two-way  tables  are  given  in  Section  5. 

2.  ONE-WAY  CLASSIFICATION  MODEL 


Suppose it is hypothesized that observed values of a random variable are affected by t
levels of a certain factor. A statistical model to study these effects may be stated as follows:

y_{ik} = \mu + \tau_i + \epsilon_{ik};  i = 1, 2, ..., t;  k = 1, 2, ..., n_i;

where y_{ik} is the k-th observation at the i-th level, \mu is a typical value, \tau_i is the effect
associated with the i-th level, and \epsilon_{ik} is an unobservable random "error".

The LAV estimates for \mu and \tau_i, i = 1, 2, ..., t, by definition solve the following
problem:

(2.1)  Minimize z = \sum_{i=1}^{t} \sum_{k=1}^{n_i} |y_{ik} - (\mu + \tau_i)|.

An immediate difficulty arises because we have one degree of freedom in choosing values for
the parameters; that is, \mu or any one of the \tau_i's may be assigned an arbitrary value without
affecting the optimal value of z. The same difficulty arises in LSQ estimation, and it is averted
by assuming that the total of the effects should be zero. Thus, the degree of freedom is
absorbed by the constraint:

(2.2)  \sum_{i=1}^{t} \tau_i = 0.

In a LAV analysis, this degree of freedom must also be removed by an additional
constraint, but now the form of the constraint is not so obvious. To see this, consider the t
disjoint problems:

(2.3)  Minimize \sum_{k=1}^{n_i} |y_{ik} - \alpha_i|,  i = 1, 2, ..., t,




where \alpha_i = \mu + \tau_i, i = 1, ..., t. An optimal value for \alpha_i, say \hat{\alpha}_i, is the median of the points
y_{ik}, k = 1, 2, ..., n_i. It then follows that, if we were using (2.2) as a constraint on the \tau_i's, the
optimal solution would be:

(2.4)  \hat{\mu} = \frac{1}{t} \sum_{i=1}^{t} \hat{\alpha}_i,

(2.5)  \hat{\tau}_i = \hat{\alpha}_i - \hat{\mu},  i = 1, 2, ..., t.

The \hat{\mu} given by (2.4) is the arithmetic mean of t medians. One reasonable alternative would be
to choose \hat{\mu} to be the median of all observations, but, to parallel the LSQ analysis as closely as
possible, we take a different approach. First note that (2.2) is equivalent to our taking up the
degree of freedom by choosing \mu and \tau_i, i = 1, 2, ..., t, so as to minimize

\sum_{i=1}^{t} \tau_i^2

while still providing LSQ estimates. Correspondingly, to obtain parameters for the LAV
estimate we minimize

(2.6)  \sum_{i=1}^{t} |\tau_i|

while maintaining the minimum value for z.


When the optimal value of \alpha_i is unique (i.e., the median of the points y_{ik},
k = 1, 2, ..., n_i, is unique) for all i, \hat{\mu} is the median of \hat{\alpha}_i, i = 1, 2, ..., t, and \hat{\tau}_i is obtained
from (2.5). However, frequently the median of the y_{ik}'s is not unique but rather can lie
anywhere within a continuous closed interval. When this is the case, \hat{\mu} can be obtained as follows:


STEP  1:  Set  U equal  to  the  smallest  lower  bound  of  the  intervals  within  which  the 
optimal  value  of  the  a's  must  lie  (unique  values  have  the  same  upper  and  lower  bound.) 


STEP  2:  Increase  the  value  of  U until  any  further  increase  would  place  more  intervals 
completely  below  U than  there  are  completely  above  U. 


STEP  3:  Place  L equal  to  the  current  value  of  U. 


STEP  4:  Decrease  the  value  of  L until  any  further  decrease  would  place  more  intervals 
completely  above  L than  there  are  completely  below  L. 


All the values in the closed interval [L, U] are optimal for \mu subject to the additional
criterion (2.6). Let \mu^* denote the median of all the y_{ik}'s and choose the point in [L, U] that
minimizes |\mu - \mu^*|. This criterion provides an estimate that is as close as possible to the
estimate of \mu that is obtained under the minimal one-parameter model (i.e., under the hypothesis
that all the \tau's are zero). A similar procedure will be used to obtain estimates of the
parameters in the two-factor model (see Section 5). Once \hat{\mu} has been chosen from within this
interval, \hat{\alpha}_i is chosen to be as close to \hat{\mu} as possible while remaining in the range of optimality for
(2.3). The \tau's are then determined from (2.5).
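
The one-way procedure, including Steps 1 through 4, can be sketched as follows. This is an illustrative reading of the steps rather than the authors' code: the function names are hypothetical, \mu^* is computed with the conventional median (which averages the two middle values when the number of observations is even), and (2.5) is read as \tau_i = \alpha_i - \mu.

```python
import statistics

def median_interval(ys):
    """Closed interval of minimizers of sum_k |y_k - alpha| for one level, as in (2.3)."""
    ys = sorted(ys)
    n = len(ys)
    return (ys[(n - 1) // 2], ys[n // 2])

def mu_range(intervals):
    """Steps 1-4: the interval [L, U] of optimal values of mu."""
    points = sorted({p for iv in intervals for p in iv})
    # Steps 1-2: scan upward and stop at the first U where a further increase would put
    # more intervals completely below U than completely above it.
    U = next(p for p in points
             if sum(hi <= p for _, hi in intervals) > sum(lo > p for lo, _ in intervals))
    # Steps 3-4: start at U, scan downward, and stop at the first L where a further
    # decrease would put more intervals completely above L than completely below it.
    L = next(p for p in reversed(points)
             if p <= U
             and sum(lo >= p for lo, _ in intervals) > sum(hi < p for _, hi in intervals))
    return L, U

def one_way_lav(groups):
    """Return (mu, taus) for a one-way table given as a list of observation lists."""
    intervals = [median_interval(g) for g in groups]
    L, U = mu_range(intervals)
    mu_star = statistics.median([y for g in groups for y in g])
    mu = min(max(mu_star, L), U)                              # point of [L, U] closest to mu*
    alphas = [min(max(mu, lo), hi) for lo, hi in intervals]   # optimal alpha_i closest to mu
    taus = [a - mu for a in alphas]                           # (2.5)
    return mu, taus

print(one_way_lav([[1, 1, 1], [2, 4], [3, 5]]))
```

Only interval endpoints are examined because the counts of intervals lying completely above or below a point can change only there.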


This LAV estimate is unique. Although the additional criteria that were added to force a
unique solution are arbitrary, they are reasonable for this situation. Other approaches, similar
to the goal-programming (constrained regression) approach [8,9,11,24] proposed here, can be
used to define a unique optimal solution, or we could let \hat{\mu} = (L + U)/2. Unless these
additional criteria are rather complex, LAV estimates are easily obtained for a one-way table.
However, as we shall see in the next section, the extension of the LAV approach to two-way tables
is far more complex than the corresponding LSQ extension.

3.  TWO-WAY  TABLE 

3.1  Definition of Model

A two-factor model arises when a second factor is introduced as follows:

y_{ijk} = \mu + \tau_i + \beta_j + \epsilon_{ijk},  i = 1, ..., r;  j = 1, ..., c;  k = 1, ..., n_{ij}.

Thus, y_{ijk} is the k-th observation at the i-th level of the first factor and the j-th level of the
second factor; \tau_i represents the effect of the i-th level of the first factor (i.e., row effect), \beta_j
represents the effect of the j-th level of the second factor (column effect), and \mu is a typical
value. LAV estimates of the parameters are obtained by solution of the following problem:

(3.1)  Minimize z = \sum_{i=1}^{r} \sum_{j=1}^{c} \sum_{k=1}^{n_{ij}} |y_{ijk} - (\mu + \tau_i + \beta_j)|.

There are two degrees of freedom in the assignment of values to \mu, \tau_i, and \beta_j; thus,
restrictions should be added to the problem. In the corresponding LSQ analysis, (2.2) and

(3.2)  \sum_{j=1}^{c} \beta_j = 0

are appended. This is equivalent to the provision of LSQ estimates that minimize

\sum_{i=1}^{r} \tau_i^2 + \sum_{j=1}^{c} \beta_j^2.

Analogously, in the LAV analysis we provide estimates that minimize

(3.3)  \sum_{i=1}^{r} |\tau_i| + \sum_{j=1}^{c} |\beta_j|

subject to the optimal value for z in (3.1) being maintained. As in the one-way analysis, this
additional criterion does not necessarily provide a unique solution (see Section 5 for an
alternative). Further restrictions, or a completely different set of criteria, may determine a unique
solution. Our purpose is to present what we feel are reasonable conditions for LAV estimates
to satisfy.


3.2  Computational  Approaches 

Before (3.3) is considered, two computer-oriented approaches used to obtain an optimal
solution to (3.1) will be discussed. We again make the transformation \alpha_i = \mu + \tau_i, and restate
(3.1) as:

(3.4)  Minimize z = \sum_i \sum_j \sum_k |y_{ijk} - (\alpha_i + \beta_j)|.
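
To make the objective in (3.4) concrete, the sketch below evaluates z for a small two-way table and shows that adding a constant to every \alpha_i while subtracting it from every \beta_j leaves z unchanged, which is one of the degrees of freedom mentioned above. The data, parameter values, and function name are hypothetical.

```python
def lav_objective(y, alpha, beta):
    """z in (3.4); y[i][j] is the list of observations in cell (i, j)."""
    return sum(abs(obs - (alpha[i] + beta[j]))
               for i, row in enumerate(y)
               for j, cell in enumerate(row)
               for obs in cell)

# A 2 x 2 table with unequal numbers of observations per cell.
y = [[[10, 12], [15]],
     [[11], [18, 17]]]
alpha, beta = [10, 12], [0, 5]

shift = 3
z_original = lav_objective(y, alpha, beta)
z_shifted = lav_objective(y, [a + shift for a in alpha], [b - shift for b in beta])
print(z_original, z_shifted)
```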




Problem (3.4) can be written as a linear programming problem:

(3.5)  Minimize \sum_i \sum_j \sum_k (d^+_{ijk} + d^-_{ijk})

subject to

\alpha_i + \beta_j - y_{ijk} + d^+_{ijk} - d^-_{ijk} = 0,
d^+_{ijk} \ge 0,  d^-_{ijk} \ge 0,
i = 1, ..., r;  j = 1, ..., c;  k = 1, ..., n_{ij},

where d^+_{ijk} and d^-_{ijk} are the positive and negative deviations of the regression equation from the
observation y_{ijk}, respectively. Problem (3.5) is not tractable in its present form for a direct
application of the simplex algorithm. The main reason for this is that the number of
constraints is equal to the number of observations, which may give rise to an excessively large
basis matrix. This difficulty can be overcome by taking the dual of (3.5), which is given by:

(3.6)  Maximize \sum_i \sum_j \sum_k y_{ijk} \pi_{ijk}

subject to

\sum_j \sum_k \pi_{ijk} = 0,  i = 1, ..., r;
\sum_i \sum_k \pi_{ijk} = 0,  j = 1, ..., c;
-1 \le \pi_{ijk} \le 1,  i = 1, ..., r;  j = 1, ..., c;  k = 1, ..., n_{ij}.

By making the transformation \pi'_{ijk} = \pi_{ijk} + 1, we can write (3.6) in a more-standard
linear programming format:

(3.7)  Maximize \sum_i \sum_j \sum_k (y_{ijk} \pi'_{ijk} - y_{ijk})

subject to

\sum_j \sum_k \pi'_{ijk} = \sum_j n_{ij},  i = 1, ..., r,
\sum_i \sum_k \pi'_{ijk} = \sum_i n_{ij},  j = 1, ..., c,

and

0 \le \pi'_{ijk} \le 2,  i = 1, ..., r;  j = 1, ..., c;  k = 1, ..., n_{ij}.

It can now be recognized that (3.7) is a capacitated transportation problem [10] with r
origins and c destinations except that, because of multiple observations within cells, there is more
than one path from origin i to destination j. We can incorporate this extension into the
standard LP algorithm by considering \pi'_{ijk}'s for entry into the basis only when all other LP variables
corresponding to observations in cell (i, j) with a value larger than y_{ijk} are at their upper
bound. Computational results [15,16] indicate that transportation problems can be solved
approximately 150 times faster by using a special-purpose primal simplex code as opposed to a
general-purpose state-of-the-art LP code. Thus, considerable savings can be derived if we
recognize the special structure of (3.6).
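
The transportation structure of (3.7) can be seen by assembling its data from the cell observations. The sketch below only builds the supplies, demands, and arcs; it does not solve the problem, and the function name and tuple layout are illustrative choices rather than anything prescribed by the text.

```python
def transportation_data(y):
    """Supplies, demands, and arcs of (3.7) for cell observations y[i][j]."""
    r, c = len(y), len(y[0])
    n = [[len(y[i][j]) for j in range(c)] for i in range(r)]
    supply = [sum(n[i]) for i in range(r)]                        # sum_j n_ij for origin i
    demand = [sum(n[i][j] for i in range(r)) for j in range(c)]   # sum_i n_ij for destination j
    # One arc per observation: (origin i, destination j, objective coefficient y_ijk, capacity 2).
    arcs = [(i, j, obs, 2) for i in range(r) for j in range(c) for obs in y[i][j]]
    return supply, demand, arcs

y = [[[10, 12], [15]],
     [[11], [18, 17]]]
supply, demand, arcs = transportation_data(y)
print(supply, demand, len(arcs))
```

A flow of one unit on every arc (that is, \pi_{ijk} = 0 in (3.6)) satisfies all of the constraints, so the problem is always feasible.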



Once (3.6) has been solved, optimal values for the \alpha_i's and \beta_j's in (3.4) are given by the
dual variables or simplex multipliers for the first r + c constraints. There is, however, one
degree of freedom in the choice of the \alpha_i's and \beta_j's, and a second degree of freedom in the
assignment of values to the \tau_i's and \mu. These degrees of freedom can be taken up if we satisfy
criterion (3.3). We delay the discussion of how to accommodate (3.3) until after the primal
approach to (3.5) has been presented.

For the general problem of obtaining parameters for absolute-deviations estimates, it has
been shown [3,4,30] that solving the general-case equivalent of (3.5) directly with a
special-purpose primal algorithm is computationally the most efficient approach available. There is no
reason to believe that this would not also be true here, as the structure of the problem can still
be utilized to perform the operations of the algorithm without a matrix ever being inverted
explicitly. This algorithm, modified to take advantage of the structure of (3.5), will not be
developed here, but a brief overview is given to indicate the use of techniques found in solving
transportation problems and to state a formula required in the next section.

We begin by restating the constraints of (3.5) in matrix notation as follows:

(3.8)  A\gamma - Y + D^+ - D^- = 0,

(3.9)  D^+ \ge 0,  D^- \ge 0,

where \gamma = (\alpha_1, ..., \alpha_r, \beta_1, ..., \beta_c), and A is a (\sum_i \sum_j n_{ij}) by (r + c) matrix of 0's and 1's
with a single dependent column. It is clear from the objective function of (3.5) that (3.8) can
also be written as:

(3.10)  Y - D^+ \le A\gamma \le Y + D^-.

The algorithm at any stage works with a basis consisting of (r + c - 1) rows of A. To
distinguish between the basic and nonbasic rows of A we partition A, Y, D^+, and D^- and rewrite
(3.10) as

\begin{bmatrix} Y_B \\ Y_N \end{bmatrix} - \begin{bmatrix} D_B^+ \\ D_N^+ \end{bmatrix}
  \le \begin{bmatrix} B \\ N \end{bmatrix} \gamma
  \le \begin{bmatrix} Y_B \\ Y_N \end{bmatrix} + \begin{bmatrix} D_B^- \\ D_N^- \end{bmatrix},

where B denotes the basis and N the remaining rows of A. If we let \lambda = B\gamma, then the
constraints (3.8) become

Y_B - D_B^+ \le \lambda \le Y_B + D_B^-,
Y_N - D_N^+ \le N B^* \lambda \le Y_N + D_N^-,

where B^* is a generalized inverse of B. The current solution is \lambda^* = Y_B, D_B^+ = D_B^- = 0,
\gamma^* = B^* \lambda^*, and the deviations in the nonbasic rows are as small as possible based on \lambda^* and
the constraints. The structure of B allows it to be stored as a spanning tree [15] similar to that
of the basis of a transportation problem. This allows us to use the triangularity of B (after a
dependent column is dropped) to solve B\gamma = \lambda without our explicitly obtaining B^*. Thus,
B^* is not required to obtain a basic solution; and, in fact, it is never needed.

The next step in the algorithm is to determine whether or not an increase or decrease in
any \lambda_i away from its current value \lambda_i^* will decrease the objective value. The objective function
rate of change is 1 + \theta_i or 1 - \theta_i when \lambda_i is increased or decreased, respectively, where
\theta = (\theta_1, \theta_2, ..., \theta_{r+c-1}) is given by

(3.11)  \theta = \sum_j^{-} N_j B^* - \sum_j^{+} N_j B^*.

In (3.11) the + and - superscripts indicate summation over rows of N with positive and
negative deviations, respectively. When a nonbasic row has a zero residual it is classified by the
algorithm as a positive or negative zero; thus, every nonbasic row will be included exactly once in
the above summation. Again, the triangularity of B allows us to obtain \theta without B^* (just as in
the transportation algorithm a primal solution is obtained). Equality (3.11) will be important in
the next section, where conditions for alternative solutions to the LP problem (3.5) are
discussed.


Since we are minimizing the sum of the absolute deviations, a basis change would be
made if |\theta_i| > 1. Hence, the current solution is recognized as optimal when |\theta_i| \le 1,
i = 1, 2, ..., r + c - 1.

The pivot rule of Barrodale and Roberts [4], which may combine several standard simplex
pivots into one, is used to determine the row of N to enter the basis. The details of
implementing this rule while effectively utilizing the structure of B and N will not be given here, but
the major computational step is similar to calculating the ratio in a dual simplex transportation
algorithm [17].


3.3  A Secondary  Criterion  for  Choosing  the  Parameter  Estimates 


The remainder of this section will be devoted to describing how the secondary criterion
(3.3) can be considered within the framework of an LP algorithm. The procedure can be
utilized on a revised version of (3.5) or (3.7), and it is similar to the perturbation method of
Charnes [7]. As in Charnes' method, an arbitrarily small positive number \epsilon will be used in the
description, but the most efficient implementation would never assign a value to \epsilon, and all
calculations involving \epsilon are performed implicitly. However, if we place \epsilon equal to a specific value
additional computer coding is avoided. We begin by noting that (3.3) can be expressed in LP
form as:

(3.12)  Minimize \sum_{j=1}^{r+c} (\delta_j^+ + \delta_j^-),

subject to

(3.13)  \tau_i + \delta_i^+ - \delta_i^- = 0,  i = 1, ..., r,
        \beta_j + \delta_{r+j}^+ - \delta_{r+j}^- = 0,  j = 1, ..., c,
        \delta_j^+ \ge 0 and \delta_j^- \ge 0,  j = 1, ..., r + c,

where \delta_j^+ and \delta_j^- are the positive and negative deviations of the effects from zero.

Problem (3.12) is of only secondary concern, since we wish always to obtain an optimal
solution to (3.5). The desired optimal solution is given in a limiting sense (\epsilon \to 0) by solving

(3.14)  Minimize \sum_i \sum_j \sum_k (d^+_{ijk} + d^-_{ijk}) + \epsilon \sum_{j=1}^{r+c} (\delta_j^+ + \delta_j^-)

subject to (3.8), (3.9), and (3.13).




Problem (3.14) is not in the form in which the columns have the exact structure that the
rows of a transportation problem possess. To obtain the desired format let \beta_{c+1} = -\mu and
create a "dummy parameter" \beta_{c+2} (this is a variable in the LP problem). The problem then
becomes:

(3.15)  Minimize \sum_i \sum_j \sum_k (d^+_{ijk} + d^-_{ijk}) + \epsilon \sum_{j=1}^{r+c} (\delta_j^+ + \delta_j^-)

subject to

\alpha_i + \beta_j - y_{ijk} + d^+_{ijk} - d^-_{ijk} = 0,  i = 1, ..., r,  j = 1, ..., c,  k = 1, ..., n_{ij},
\alpha_i + \beta_{c+1} + \delta_i^+ - \delta_i^- = 0,  i = 1, ..., r,
\beta_{c+2} + \beta_j + \delta_{r+j}^+ - \delta_{r+j}^- = 0,  j = 1, ..., c,
\delta_j^+ \ge 0,  \delta_j^- \ge 0,  d^+_{ijk} \ge 0,  d^-_{ijk} \ge 0,

where the degree of freedom in assigning values to the parameters is absorbed by always
placing \beta_{c+2} = 0.

The algorithm described to solve (3.5) directly can be used to solve (3.15) with a slight
modification to account for a weight of $\epsilon$, rather than one, on the deviations of $\tau_i$ and
$\beta_j$ away from zero. Also, if we take the dual of (3.15) and make the lower bound on the
variables in this dual problem zero, a capacitated transportation problem is again created. This
can, of course, be solved with a standard code; however, we must take care to ensure that
$\beta_{c+2} = 0$ when working back to the optimal solution to (3.15).

The formulation just described takes care of the two degrees of freedom at the expense of
creating an additional "source" and an additional "destination" in the transportation problem,
and the possibility of alternative optimal solutions has been reduced considerably. The problem
of alternative optimal solutions to (3.3) is discussed in the next section along with statements
of sufficient conditions for alternative optimal solutions to exist.

4.  ALTERNATIVE  OPTIMAL  SOLUTIONS 

A disturbing aspect of LAV estimation for two-way tables is that alternative optimal solutions
frequently occur, and decidedly different estimates are obtainable. This difficulty may be
averted if we specify additional criteria for the estimates to satisfy. It is the purpose of this
section to indicate that alternative optimal "fits" are to be expected in the analysis of two-way
tables via LAV procedures if (3.1) is the sole criterion. This serves to emphasize the importance
of "good" additional criteria.

It is well known that the median of an even number of observations is unique only when
the two middle observations have the same value. We obtain the parameters for the one-way
model (2.3) by taking the median of t sets of observations, and the values will be unique only
when all t medians are unique. However, we can always obtain a unique solution by adding the
additional restrictions described in Section 2.
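
The non-uniqueness of the median for an even number of observations can be made concrete with a
short Python sketch. The names `median_interval` and `is_unique` are hypothetical helpers, not
part of the paper; the first returns the closed interval of values that minimize the sum of
absolute deviations, and the second reports whether that interval collapses to a single point.

```python
def median_interval(values):
    """Return (low, high) such that every m with low <= m <= high
    minimizes sum(abs(v - m) for v in values)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one observation")
    if n % 2 == 1:
        # Odd count: the middle observation is the only minimizer.
        middle = ordered[n // 2]
        return (middle, middle)
    # Even count: any value between the two middle observations is a median.
    return (ordered[n // 2 - 1], ordered[n // 2])


def is_unique(interval):
    """The median is unique only when the interval is a single point."""
    low, high = interval
    return low == high
```

Applied to each of the t sets of observations in the one-way model, the estimates are unique
only when `is_unique` holds for every set.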

With respect to the two-way model, it can be shown that an LP solution to (3.5) is
optimal if the $\theta$ we obtain by solving (3.11) satisfies

          $-1 \le \theta_i \le 1, \quad i = 1, 2, \ldots, r + c - 1.$


LEAST-ABSOLUTE-VALUE  ESTIMATORS 




Furthermore, the basis $B$ is a unique optimal basis only when

          $-1 < \theta_i < 1, \quad i = 1, 2, \ldots, r + c - 1;$

in other words, an alternative optimal basis exists if at the completion of the algorithm $\theta_i$
equals $-1$ or $+1$ for any $i$. But because $N_j$ is a vector of 0's and 1's, and because $B$ is a
unimodular matrix [18], $\theta$ will always have integer components. Hence, at optimality $\theta_i$ will
equal either $-1$, $+1$, or 0. This means that the current optimal basis is unique if and only if all the
components of $\theta$ are zero, and this will only be true when

(4.1)     $\sum_{j \in J^+} N_j - \sum_{j \in J^-} N_j = 0,$

where the first sum runs over the $N_j$ associated with positive deviations and the second over the
$N_j$ associated with negative deviations.

Condition (4.1) forms the foundation for proving the theorem of this section. It might be well
to point out at this time that all our results relate only to alternative optimal basic matrices, not
to alternative optimal fits. However, an alternative optimal basis is equivalent to an alternative
optimal fit whenever an optimal fit interpolates exactly $r + c - 1$ points. Theoretically, for
fixed sample size this will occur with probability 1 whenever the observations are taken from a
continuous population.

The following theorem is concerned with the special case of a two-way classification model
where $n_{ij} = 1$ for all $i$ and $j$.

THEOREM 4.1: The LP problem (3.3), which is equivalent to the problem of finding
LAV estimates for a two-way classification model with exactly one observation per cell, will
have alternative optimal basic matrices whenever the minimum of r and c is even.

PROOF: Suppose that the LP problem has been solved and an optimal basic matrix has
been obtained. For this matrix to be a unique optimal basic matrix, condition (4.1) must be
satisfied. This occurs only if for each nonzero component from an $N_j$ associated with a positive
deviation there corresponds a nonzero component from an $N_j$ associated with a negative
deviation. In other words, the nonbasic rows of A must contain an even number of nonzero
coefficients in each column, because the sum of an odd number of plus or minus ones will
never be zero. The proof of the theorem will consist of showing that, whenever the minimum
of c and r is even, there is at least one column with an odd number of nonzero components
(+1's) in the nonbasic rows.

For explanatory purposes, we assume $r \le c$, but the proof follows in an analogous
manner if the reverse is true. There are $r + c - 1$ rows of A in the basis B and, because B
forms a basis, every column of B must have at least one nonzero entry. We note that B is a
submatrix of A and that each row has one and only one nonzero entry in the first r components
and one and only one nonzero entry in the last c components. Also, each of the last c columns
of A has exactly r 1's with the remaining coefficients being 0's. To satisfy (4.1) there must be
an even number of 1's in each of the last c columns of N. Thus, since r is even, there must be
an even number of 1's in these c columns of B. But each column of B must have at least one
nonzero entry, and then there must be at least two 1's present. This would require B to have
at least $2c > c + r - 1$ rows. Therefore, at least one of the last c columns of B has a single
nonzero entry. The proof of the theorem now follows from the inability to satisfy (4.1).
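
The counting argument in the proof can be checked by brute force on small tables. The sketch
below is illustrative only and assumes $r \le c$ as in the proof; the function names are
hypothetical. It does not test whether a set of rows is a true LP basis. It only enumerates
every choice of $r + c - 1$ cells that puts at least one entry in each of the last c columns (a
necessary condition for a basis) and asks whether the nonbasic rows can leave an even number of
1's in every one of those columns.

```python
from itertools import combinations


def all_cells(r, c):
    """One observation per cell: row (i, j) of A has a 1 for row effect i
    and a 1 for column effect j."""
    return [(i, j) for i in range(r) for j in range(c)]


def nonbasic_column_counts(r, c, basic_cells):
    """Number of 1's in each of the last c columns over the nonbasic rows."""
    basic = set(basic_cells)
    counts = [0] * c
    for (i, j) in all_cells(r, c):
        if (i, j) not in basic:
            counts[j] += 1
    return counts


def parity_condition_attainable(r, c):
    """True if some selection of r + c - 1 cells covering every column effect
    leaves an even count of nonbasic 1's in each of the last c columns."""
    for chosen in combinations(all_cells(r, c), r + c - 1):
        if {j for _, j in chosen} != set(range(c)):
            continue  # some column of B would have no nonzero entry
        if all(n % 2 == 0 for n in nonbasic_column_counts(r, c, chosen)):
            return True
    return False
```

For even r with $r \le c$ (for example 2-by-2, 2-by-3, or 4-by-4) the function returns False, in
line with the proof; for a 3-by-3 table the parity condition on these columns can be met, so the
argument does not apply there.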

It is not difficult to derive examples of two-way tables with a unique optimal basis for the
LP equivalent. Consider the two-way table of Exhibit 1. This example has a unique optimal
basis matrix with the optimal fit interpolating observations 3, 4, 5, 6, and 7. However,




EXHIBIT  1.  Two-way  table  with  a 
unique  optimal  LP  basis  matrix. 


alternative optimal basic matrices exist if the 6 and 8 interchange position. Certainly our
computational experience would indicate that, even if the conditions of Theorem 4.1 are not
satisfied, alternative optimal fits are more likely to appear than not.

Furthermore, a unique optimal basis matrix does not clearly define the estimates for the
parameters. There are two degrees of freedom that provide us with the ability to arbitrarily
choose values for two parameters and remain optimal. In the previous sections we have
proposed additional criteria that deal with this problem, and we will discuss this matter further in
the next section via numerical examples.

5.  NUMERICAL  EXAMPLES 

5.1  LAV  Analysis  of  One-Way  Table 

To  illustrate  the  application  of  the  algorithm  described  in  Section  2,  we  will  use  the 
Nebraska  voting  data  shown  in  Exhibit  2 [31,  Chap.  19].  In  this  section  two  separate  one-way 
analyses  will  be  carried  out.  These  results  are  used  in  Section  5.3,  where  the  same  data  is  used 
to  illustrate  the  LAV  analysis  of  a two-way  table. 

EXHIBIT 2. Nebraska Voting: Raw % Democratic for 11 Counties
in 12 Presidential Elections (unit = 0.1%)


County 


'20  I '24  | '28  | '32  | '36  I '40  '44  '48  '52  '56  '60  '64 


305  | 619  510  397  372  | 411 

606  497  363  388 


626  510  407  | 404 

569  450  354 


230 
374  I 218 


150  | 553  426  349  272  472  177  240 

561  425  352  340  360  179  232 

547  472  336  313  436  195  219 

454  661  513  384  379  454  253  307 

352  776  526  463  442  553  337  358  I 360 


Source:  Tukey,  J.W.,  Exploratory  Data  Analysis,  11  (Addison-Wesley,  Reading, 
Massachusetts,  1971). 


First, we consider the rows (i.e., counties). The median intervals are shown in Exhibit 3,
and, using the algorithm in Section 2, we obtain $(L, U) = (325, 342)$. Since the median
$\mu^* = 338$ lies in this interval, we set $\mu = 338$ and obtain the fitted values and effects shown
in the last two columns of Exhibit 3. The same procedure is then applied to the columns (i.e.,
years), and the fitted values and effects are shown in Exhibit 4.




EXHIBIT 3. LAV One-Way Analysis of Counties (i.e., Rows of Exhibit 2) for
the Nebraska Voting Data

County    Median Interval    Fit    Effect
D5        (240, 272)         272     -66
D1        (252, 257)         257     -81
D6        (265, 313)         313     -31
B5        (270, 310)         310     -28
B4        (291, 325)         325     -13
B1        (305, 372)         338       0
D4        (342, 374)         342       4
D0        (353, 358)         353      15
D7        (360, 439)         360      22
D2        (372, 379)         372      24
B7        (379, 384)         379      41

Note: The order has been changed to illustrate the procedure for obtaining
the interval (L, U) = (325, 342).
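
The interval (L, U) and the fitted values can be reproduced with a small Python sketch. The
Section 2 algorithm itself is not repeated here, so the code rests on an assumption: that the
procedure amounts to choosing $\mu$ to minimize the total distance from $\mu$ to the median
intervals, and then taking each fit as the point of its interval closest to $\mu$. The function
names are hypothetical.

```python
# Median intervals from Exhibit 3, in the order listed there.
EXHIBIT_3_INTERVALS = [
    (240, 272), (252, 257), (265, 313), (270, 310), (291, 325), (305, 372),
    (342, 374), (353, 358), (360, 439), (372, 379), (379, 384),
]


def optimal_mu_range(intervals):
    """Return (L, U), the range of mu minimizing the summed distance
    from mu to each interval (zero when mu lies inside an interval)."""
    def total_distance(mu):
        return sum(max(low - mu, 0, mu - high) for low, high in intervals)

    # The objective is convex and piecewise linear with breakpoints at the
    # interval endpoints, so the minimizing range is bounded by endpoints.
    endpoints = sorted({point for interval in intervals for point in interval})
    best = min(total_distance(point) for point in endpoints)
    minimizers = [point for point in endpoints if total_distance(point) == best]
    return minimizers[0], minimizers[-1]


def fitted_values(intervals, mu):
    """Fit for each interval: the point of the interval closest to mu."""
    return [min(max(mu, low), high) for low, high in intervals]
```

With the intervals above, `optimal_mu_range` gives (325, 342), and `fitted_values` with
$\mu = 338$ gives the Fit column of Exhibit 3.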


EXHIBIT 4. LAV One-Way Analysis of Years (i.e., Columns)
for the Nebraska Voting Data

Year     '20   '24   '28   '32   '36   '40   '44   '48   '52   '56   '60   '64
Effect   -50   -86   -74   281   159    25     7    95  -142   -98  -109    51
Fit      288   252   264   619   497   363   345   433   196   240   229   389


In the LSQ analysis the best estimates of the row and column effects (along with the
overall mean) provided the solution to the two-way analysis. For the LAV analysis this is not
true, but in Section 5.3 we propose obtaining the LAV estimates for the two-way table that are
as close as possible to these restricted fits. It will then be possible to assess the relative
importance of the row effects after column effects have already been included in the model.

5.2  Some  Small  Examples  for  Two-Way  Tables 

To illustrate the difficulties that arise in choosing values for the parameters in the two-way
model, we consider the table given by Exhibit 5. We begin the analysis by obtaining LAV
estimates with additional restrictions $\tau_1 + \tau_2 = 0$ and $\beta_1 + \beta_2 = 0$. We can obtain optimal
parameter values from the LP by considering any extreme point that defines a hyperplane passing
through three of the four observations. Thus, four optimal extreme-point solutions are possible,
and they are given by Exhibit 6. All optimal solutions are given by convex combinations
of these four points. Clearly, a great deal of discrepancy is possible among optimal solutions.
If we consider only extreme-point solutions, the observation which the hyperplane does not
interpolate will have a residual with absolute value 998 and the other points will, of course,
have a zero residual. Thus, the LP solution could indicate that any one of the four observations
has an unduly large residual and could make it a candidate for consideration as an outlier.

If we perform the LAV analysis on the same table with additional criterion (3.3) rather
than (2.2) and (3.2), a unique solution ($\mu = 1$, $\tau_1 = \tau_2 = \beta_1 = \beta_2 = 0$) is obtained. An
inspection of the residuals now indicates that the observation in cell (2, 2) might be considered an
outlier.




EXHIBIT 5. Sample Two-Way Table
with a Single Outlier

      1      1
      1    999

EXHIBIT 6. Optimal Extreme-Point Solutions
to the Two-Way Table of Exhibit 5 with the Constraints
$\tau_1 + \tau_2 = 0$ and $\beta_1 + \beta_2 = 0$ Added

Extreme-Point
Solution         $\mu$   $\tau_1$   $\tau_2$   $\beta_1$   $\beta_2$
1                 500    -499     499       0       0
2                   1    -499     499    -499     499
3                 500       0       0    -499     499
4                   1       0       0       0       0
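
The claims about Exhibits 5 and 6 can be verified directly. The Python sketch below evaluates
the sum of absolute residuals for each extreme-point solution, and then the secondary quantity,
the sum of the absolute row and column effects, which is the reading of criterion (3.3) given
by (3.12). The variable and function names are illustrative.

```python
TABLE = [[1, 1], [1, 999]]  # Exhibit 5

# Exhibit 6: solution number -> (mu, (tau_1, tau_2), (beta_1, beta_2))
SOLUTIONS = {
    1: (500, (-499, 499), (0, 0)),
    2: (1, (-499, 499), (-499, 499)),
    3: (500, (0, 0), (-499, 499)),
    4: (1, (0, 0), (0, 0)),
}


def primary_objective(mu, tau, beta):
    """Sum of absolute residuals of the row-plus-column fit."""
    return sum(
        abs(TABLE[i][j] - (mu + tau[i] + beta[j]))
        for i in range(2)
        for j in range(2)
    )


def secondary_objective(tau, beta):
    """Sum of absolute deviations of the effects from zero, as in (3.12)."""
    return sum(abs(t) for t in tau) + sum(abs(b) for b in beta)


primary = {k: primary_objective(*v) for k, v in SOLUTIONS.items()}
secondary = {k: secondary_objective(v[1], v[2]) for k, v in SOLUTIONS.items()}
preferred = min(secondary, key=secondary.get)
```

All four solutions attain the same primary objective value of 998, while the secondary
quantity is smallest for solution 4, the one with $\mu = 1$ and all effects zero.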

The two-by-two table of Exhibit 5 is an extreme case, and it was presented to indicate
what could occur in the LAV analysis of two-way tables if caution is not exercised. Generally,
such widely divergent solutions will not be encountered, regardless of the additional criteria
employed to absorb the degrees of freedom.

The next example (Exhibit 7) is a four-by-four table from Tukey [31, Chap. 22]. With
the additional criterion (3.3) appended to the problem, eight optimal extreme-point solutions
were found. These are given in Exhibit 8. The complete set of optimal extreme-point solutions
can be generated [18, p. 166]; however, the amount of work required to do so is generally
prohibitive. No attempt was made to generate all optimal extreme solutions to the two-way
table in Exhibit 7.

EXHIBIT 7. Sample Two-Way Table
from Tukey [31, Chap. 22]

    718    732    734    793
    725    781    725    716
    704   1035    763    758
    726    765    738    761

EXHIBIT  8.  Table  of  Eight  Optimal  Extreme-Point 
Solutions  for  the  LAV  Estimation  Problem  Obtained  from 
Exhibit  7 with  Criterion  (3.3)  Added 


Solution 

m! 

n 

El 

El 

El 

ft 

ft 

* 

ft 

1 

El 

El 

-i 

0 

0 

-12 

27 

Bl 

20 

Irl 

-i 

Cl 

Cl 

-16 

23 

.■rj 

16 

758 

a 

-3 

if 

0 

7 

HLl 

Hna 

758 

-8 

-1 

u 

0 

-32 

7 

-16 

o 

758 

El 

11 

If 

0 

-33 

7 

au 

738 

El 

n 

0 

n 

-13 

27 

u 

20 

741 

§o 

u 

i 

-16 

23 

17 

Kfl 

758 

El 

UL 

0 

i 

-33 

6 

-17 

0 



We  note  that  all  the  solutions  of  Exhibit  8 indicate  that  the  row  effects  are  small  relative 
to  the  column  effects.  Also,  the  residual  for  the  outlier  of  cell  (3,2)  is  270  for  all  but  the  last 
solution,  where  it  is  271. 

5.3 LAV Analysis of a Two-Way Table

In Section 5.1 we considered the one-way analysis of the Nebraska voting data. The two-way
analysis of this data by LSQ is shown in Exhibit 9. The LAV estimates are obtained if we
solve (3.1) with the additional criterion that we minimize

(5.1)     $|\mu - \mu^*| + \sum_{i=1}^{r} |\tau_i - \tau_i^*| + \sum_{j=1}^{c} |\beta_j - \beta_j^*|$

subject to the optimal value for z in (3.1) being maintained. In (5.1), the superscript * denotes
the LAV estimates that are obtained for the one-way fits (see Section 5.1). The robust
elementary analysis we obtained using LAV is shown in Exhibit 10. The stem-and-leaf plots, hinges,
midspreads, and side values for the residuals obtained from the LSQ and LAV fits are shown in
Exhibit 11, and the large (outside, i.e., past the side values) residuals are identified in Exhibit
12.


EXHIBIT 9. Elementary Analysis by Means (i.e., LSQ) of the
Nebraska Voting Data (Unit = 0.1%)

                                      Year
County   '20   '24   '28   '32   '36   '40   '44   '48   '52   '56   '60   '64    Eff    Fit
D0        15    61   256    88    27   -40   -40  -117   -63  -115   -49   -23     38    387
D1        91    61     8   106   -14   -31   -41   -91     3   -28   -41   -23    -69    280
B1       -29    26    -7   -28    15    14     9   -21     4     8    16    -8     16    365
D2        62    96   -42   -41     2   -20    25     0   -34   -37   -13     2     16    365
D4        26   -49   -47   -20    16    25    42    64     0     5   -41   -22     15    364
B4       -15    47   -33   -46   -13     3    -6   -27    20    31    -3    45    -15    334
D5       -25   -35   -98   -30    -5    30   -27   104    11    44    26     3    -48    301
B5        11    19   -27   -28   -12    27    35   -15     7    30   -17   -29    -42    307
D6        -0   -28   -96   -49    29     5     2    55    16    11    14    42    -36    313
B7       -31   -55   105   -23   -19   -36   -21   -15   -14    10    70    28     53    402
D7      -104  -142   -18    71   -26    23    21    63    49    40    39   -16     73    422
Eff      -48   -89   -53   283   130    18    -2    68  -135  -105  -101    32    349      0
Fit      301   260   296   632   479   367   347   417   214   244   248   381      0   -349


EXHIBIT 10. Nebraska Voting Data from Exhibit 2: Robust
Elementary Analysis by LAV (Unit = 0.1%)

                                      Year
County   '20   '24   '28   '32   '36   '40   '44   '48   '52   '56   '60   '64    Eff    Fit
D0        56    81   322   144    50    -3     0   -83   -30   -85   -13     0      7    345
D1       124    73    67   154     0    -3    -9   -65    28    -6   -13    -8    -91    247
B1       -25     9    22   -10     0    13    11   -25    -1     0    14   -23     23    361
D2        79    92     0   -10     0    -8    40    10   -26   -32    -1     0     10    348
D4        33   -63   -15     1     4    27    47    64    -1     0   -39   -34     19    357
B4        -7    34     0   -24   -24     6     0   -26    19    27     0    34    -13    325
D5        -9   -40   -57     0    -8    41   -13   112    18    48    36     0    -53    285
B5        13     0     0   -12   -29    24    35   -20     0    20   -20   -46    -33    305
D6         0   -49   -70   -34    10     0     0    48     8    -1     9    24    -25    313
B7        -9   -54   153    14   -15   -18     0     0     0    21    87    32     41    379
D7      -122  -181   -10    68   -63     0     2    38    23    11    16   -52    102    440
Eff      -48   -68   -78   268   149    23     0    75  -126   -93   -96    51    338      0
Fit      290   270   260   606   487   361   338   413   212   245   242   389      0   -338


EXHIBIT 11. Stem-and-leaf displays of the residuals from the LSQ and LAV fits
(displays not legible in this copy; summary values only)

                      LSQ                        LAV
Hinges                (illegible) and +26h       -13 and +23h
Midspread             53h                        36h
Side Values           -81 and +79                -49h and 60
Number Outside        7 and 7                    11 and 14




EXHIBIT 13. Elementary Analysis Using Pomedian Procedure on the
Nebraska Voting Data (see Tukey [31, Exhibit 10, Chap. 19]).

                                      Year
County   '20   '24   '28   '32   '36   '40   '44   '48   '52   '56   '60   '64    Eff    Fit
D0        45    91   314   147    59   -14    -9   -73   -36   -95   -19     8      7    356
D1       105    75    53   149     1   -22   -26   -63    14    -2   -27    -8    -83    266
B1       -29    26    21     0    16     9     9    -8     0    -3    14    -8     16    365
D2        62    96   -19   -13     3   -25   -25    14   -38   -48   -13     2     16    365
D4        26   -49   -19     8    17    20    42    78    -3    -6   -41   -22     15    364
B4        -9    51    -1   -14    -8     2    -2    -9    20    24     1    49    -20    329
D5       -24   -34   -69    -1    -3    26   -26   118     8    34    26     4    -49    300
B5         9    13    -1    -2   -13    20    33    -3     1    17   -19   -31    -40    309
D6         0   -28   -67   -20    30     0     2    41    13     0    14    43    -36    313
B7       -30   -54   135     7   -16   -39   -19     0   -16     1    71    30     53    402
D7      -124  -162   -11    24   -45    -2     2    55    22    11    19   -35     93    442
Eff      -48   -89   -81   254   129    23    -2    54  -133   -94  -101    32    349      0
Fit      301   260   268   603   478   372   347   403   216   255   248   381      0   -349


analysis is essentially exploratory in nature. We are, on the one hand, prepared to obtain
evidence that a simpler model may adequately fit the data. At the same time, we are ready to look
at the residuals from a robust row-plus-column fit using diagnostic plots or other techniques
that could suggest that outliers are present, or that an improved fit may be obtained if we add
terms to the model or reexpress the data through a transformation.

6.  DISCUSSION 

It is generally recognized that after we obtain a simple row-plus-column fit for a two-way
table the residuals should be carefully analyzed. Gentleman and Wilk [13] have considered the
effect of one or two outliers superimposed on a basic additive model with independent normal
fluctuations, with mean zero, and with constant variance. Their results indicate that when one
outlier is present the judicious use of half-normal plotting provides a complete basis for
data-analytic judgements. They further find that direct analysis of residuals (from an LSQ fit) is not
reliably indicative of the existence of peculiarities when two outliers are present. Gentleman
and Wilk [14] have also considered the problem of multiple outliers, and they have proposed
methods for identification of the "K most likely outlier subset" (where the maximum possible
value of K must be known). Their approach considers to what extent a p-parameter model
analysis can be statistically improved by selective reduction in the size (n) of the data. Their
method results in an LSQ analysis of the "good data."

In many situations the form of the model is only tentative, and a diagnostic plot of residuals
is required [31]. The diagnostic plot may suggest that an improved fit can be obtained either
by adding to the model or by reexpressing the y's. When there are several possible departures
from the ideal additive model for a two-way table, the importance of obtaining a robust fit is
increased if attempts to improve upon the conventional LSQ analysis are to be successful.
Thus, as McNeil and Tukey [26] have shown, it is possible to begin with a simple
row-plus-column fit of a two-way table using both LSQ and LAV. If the unknown $e_{ij}$'s follow a Gaussian
distribution, then we expect that the residuals from both fits should appear to be near Gaussian,
with somewhat less-stretched tails for the least-squares residuals. If the $e_{ij}$'s are from a
tail-stretched distribution, the residuals should be tail-stretched, the LSQ residuals much less
than the $e_{ij}$'s and the LAV residuals slightly more. Tail-stretched residual distributions may
also be the result of an inadequate model. Consequently, if the LSQ and LAV analyses are
clearly different then further careful analysis is required. It is important to note, as Mallows




EXHIBIT  12.  Outside  Residuals  from  Row-Plus-Column  Fits  Using  LSQ 
(see  Exhibits  9 and  11)  and  LAD  (see  Exhibits  10  and  11) 


Tukey [31, Chap. 19] obtained a resistant elementary analysis of this data using pomedian
polishing on the LSQ analysis (see Exhibit 13). The pomedian procedure leads to residuals that
are "nearly balanced" in sign in each row and column. Note that the median of each row and
column of the LAV residuals in Exhibit 10 is zero.

5.4  Quality  of  Fit  for  Two-Way  Table 

In an LSQ analysis of a two-way table the importance of the row and column effects is
measured in terms of the decrease in the sum of squares that occurs when the row (or column)
effects are included in the model. For the LAV analysis it is also possible to obtain an
indication of the importance of the row and column effects. First, we obtain
$z(\mu^*) = \sum_i \sum_j |y_{ij} - \mu^*| = 13661$. Next, we calculate $z(\mu^*, \beta^*) = 6282$,
$z(\mu^*, \tau^*) = 12531$, and $z(\mu, \tau, \beta) = 4240$. Then, using an approach suggested by
McNeil and Tukey [26], we determine that the column fit accounts for
$100[1 - (6282/13661)^2] = 78.9\%$ of the total variation, measured
on a size-squared scale in terms of the sum of the absolute deviations. Similarly, 15.9% of the
total variation is explained by the row fit, and the row-plus-column fit accounts for 90.4% of the
total variation. Thus, we are able to conclude that the size of the residuals is considerably
reduced if both row and column effects are included in the model.
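
The three percentages follow from the same formula applied to the reported objective values.
A minimal Python sketch of the arithmetic (the function name `percent_explained` is a
hypothetical label for the size-squared measure described above):

```python
def percent_explained(z_model, z_base):
    """Percentage of variation accounted for on a size-squared scale:
    100 * [1 - (z_model / z_base)^2]."""
    return 100.0 * (1.0 - (z_model / z_base) ** 2)


Z_BASE = 13661            # sum of absolute deviations about mu* alone
Z_COLUMN = 6282           # column fit
Z_ROW = 12531             # row fit
Z_ROW_PLUS_COLUMN = 4240  # row-plus-column fit

column_pct = round(percent_explained(Z_COLUMN, Z_BASE), 1)
row_pct = round(percent_explained(Z_ROW, Z_BASE), 1)
both_pct = round(percent_explained(Z_ROW_PLUS_COLUMN, Z_BASE), 1)
```

Rounded to one decimal place, these evaluate to 78.9, 15.9, and 90.4, matching the figures
quoted in the text.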

As is the case in an unbalanced LSQ analysis, the reduction in the objective function that
occurs when additional parameters are added to the model is order dependent. Consequently,
this heuristic approach to evaluation of the relative importance of a given subset of parameters
is similar to the use of $r^2$ values in the LSQ analysis. This approach is suggested when the




[25] has pointed out, that our understanding of robust techniques and the behavior of the
residuals that they generate is limited. Certainly, the results presented in this section indicate that
good judgement must be applied by the data analyst to obtain a sensible LAV fit.

REFERENCES 

[1]  Andrews, D.F., "A Robust Method for Multiple Linear Regression," Technometrics 16,
     523-531 (1974).

[2]  Armstrong, R.D., and E.L. Frome, "A Comparison of Two Algorithms for Absolute
     Deviation Curve Fitting," Journal of the American Statistical Association 71, 328-330 (1976).

[3]  Armstrong, R.D., and J.W. Hultz, "A Restricted Discrete Approximation Problem in the
     $L_1$ Norm," SIAM Journal on Numerical Analysis 14, 555-565 (1977).

[4]  Barrodale, I., and F.D.K. Roberts, "An Improved Algorithm for Discrete Linear
     Approximation," SIAM Journal on Numerical Analysis 10, 839-848 (1973).

[5]  Barrodale, I., and A. Young, "Algorithms for Best $L_1$ and $L_\infty$ Linear Approximation on a
     Discrete Set," Numerical Mathematics 8, 295-306 (1966).

[6]  Beaton, A.E., and J.W. Tukey, "The Fitting of Power Series, Meaning Polynomials,
     Illustrated on Band-Spectroscopic Data," Technometrics 16, 147-185 (1974).

[7]  Charnes, A., "Optimality and Degeneracy in Linear Programming," Econometrica 20,
     160-170 (1952).

[8]  Charnes, A., and W.W. Cooper, "Goal Programming and Constrained Regression: A
     Comment," Omega 3 (No. 4), 403-409 (1975).

[9]  Charnes, A., and W.W. Cooper, "Absolute Deviations and Constrained Regressions,"
     preface to Hilebrand D. translation of De La Valle Poussin, M. Ch. J., "On the Method of
     Minimum Approximation," Annales de la Societe de Bruxelles 35, Part II, 1-16 (1911).
     In ONR Research Memorandum 96, Carnegie-Mellon University, Pittsburgh,
     Pennsylvania (1964).

[10] Charnes, A., and W.W. Cooper, Management Models and Industrial Applications of Linear
     Programming, Vols. I and II (Wiley, New York, 1961).

[11] Charnes, A., W.W. Cooper, and R.J. Niehaus, "Studies in Manpower Planning," U.S. Navy
     Office of Civilian Manpower Management, Washington, D.C. (1972).

[12] Charnes, A., W.W. Cooper, and R. Ferguson, "Optimal Estimation of Executive
     Compensation by Linear Programming," Management Science 2, 138-151 (1955).

[13] Gentleman, J.F., and M.B. Wilk, "Detecting Outliers in a Two-Way Table, I. Statistical
     Behavior of Residuals," Technometrics 17, 1-14 (February 1975).

[14] Gentleman, J.F., and M.B. Wilk, "Detecting Outliers, II, Supplementing the Direct
     Analysis of Residuals," Biometrics 31, 387-410 (1975).

[15] Glover, F., D. Karney, and D. Klingman, "The Augmented Predecessor Index Method
     for Locating Stepping Stone Paths and Assigning Dual Prices in Distribution Problems,"
     Transportation Science 6, 171-180 (1972).

[16] Glover, F., D. Karney, D. Klingman, and A. Napier, "A Computational Study on Start
     Procedures, Basis Change Criteria, and Solution Algorithms for Transportation
     Problems," Management Science 20, 793-814 (1975).

[17] Glover, F., D. Klingman, and A. Napier, "An Efficient Dual Approach to Network
     Problems," Opsearch 9, 1-19 (1972).

[18] Hadley, G., Linear Programming (Addison-Wesley, Reading, Massachusetts, 1962).

[19] Hampel, F.R., "A General Qualitative Definition of Robustness," Annals of
     Mathematical Statistics 42, 1887-1896 (1971).

[20] Harter, H.L., "The Method of Least Squares and Some Alternatives, Part V,"
     International Statistical Review 43, 269-278 (1975).

[21] Hogg, R.V., "Adaptive Robust Procedures: A Partial Review and Some Suggestions for
     Future Applications and Theory," Journal of the American Statistical Association 69
     (December 1974).

[22] Hogg, R.V., and R.H. Randles, "Adaptive Distribution-Free Regression Methods and
     Their Application," Technometrics 17, 399-407 (1975).

[23] Huber, P.J., "Robust Statistics: A Review," The Annals of Mathematical Statistics 43,
     1041-1067 (1972).

[24] Lee, S.M., Goal Programming for Decision Analysis (Auerbach, Philadelphia, 1972).

[25] Mallows, C.L., "Discussion of Invited Papers," Technometrics 16, 187-188 (1974).

[26] McNeill, J.J., and J.W. Tukey, "Higher-Order Diagnosis of Two-Way Tables, Illustrated on
     Two Sets of Demographic Empirical Distributions," Biometrics 31, 487-510 (1975).

[27] Prescott, P., "An Approximate Test for Outliers in Linear Models," Technometrics 17,
     129-132 (1975).

[28] Robers, P.D., and A. Ben-Israel, "An Interval Programming Algorithm for Discrete Linear
     $L_1$ Approximation Problems," Journal of Approximation Theory 2, 323-336 (1969).

[29] Searle, S.R., Linear Models (Wiley, New York, 1971).

[30] Spyropoulos, K., E. Kiountouzis, and A. Young, "Discrete Approximations in the $L_1$
     Norm," The Computer Journal 16, 180-186 (1973).

[31] Tukey, J.W., Exploratory Data Analysis (Addison-Wesley, Reading, Massachusetts, 1977).

[32] Wagner, H.M., "Linear Programming Techniques for Regression Analysis," Journal of the
     American Statistical Association 54, 206-212 (1959).


MULTIPRODUCT  LOT-SIZE  SCHEDULING 
WITH  PROPORTIONAL  PRODUCT  DEMANDS 


F.  H.  Murphy 

Department  of  Energy 
Washington,  D.C. 


A.  L.  Soyster 

Virginia  Polytechnic  Institute  and  State  University 
Blacksburg,  Virginia 


ABSTRACT 

In this paper we consider the multiproduct, multiperiod production-scheduling model of Manne under the assumption that, across products, demands are interrelated over time. When demand requirements are proportional we show that the solution has a specific structure determined by the ratio of setup to production-run time of each product. This structure holds for any length horizon and may permit a substantial (time) savings for column-generation solution procedures.

1.  PROBLEM  DEFINITION 

We consider a multiproduct, multiperiod production-scheduling problem in which demand over a finite T-period horizon is known deterministically. A firm must schedule batch sizes of n products for a T-period horizon. Since the products compete for limited resources, this situation has been termed the capacitated lot-size problem. A linear programming formulation due to Manne [4] has been a classic reference for the capacitated problem. This linear programming model utilizes decision variables which represent a class of dominant sequences. One generates the class of dominant sequences via the observation that if $\{y_t\}$, $t = 1, 2, \ldots, T$, represents the production for a particular product at time $t$ and $I_t$ represents the ending inventory of the product for period $t$, then only sequences for which $y_t \cdot I_{t-1} = 0$ need be considered. For example, if $T = 3$ and the demands for a particular product are 10, 15, and 5 units for periods one, two, and three, then one can restrict consideration to $2^{T-1} = 4$ different sequences. These four viable alternatives would be

(30, 0, 0),
(25, 0, 5),
(10, 20, 0),
(10, 15, 5),

where (30, 0, 0) represents a sequence where a setup is incurred in period 1 and 30 units are processed; in periods 2 and 3 no production is scheduled, and demand is satisfied from inventory. We say that period $t < T$ is a regeneration point if $I_t = 0$.
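The enumeration of dominant sequences described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `dominant_sequences` is hypothetical, and first-period demand is assumed positive so that period 1 always carries a setup.

```python
from itertools import product


def dominant_sequences(demands):
    """List production plans (y_1, ..., y_T) satisfying y_t * I_{t-1} = 0.

    Assumes first-period demand is positive, so period 1 always has a setup.
    """
    horizon = len(demands)
    plans = []
    # Each of periods 2..T either starts a new batch (1) or does not (0).
    for later in product((0, 1), repeat=horizon - 1):
        setups = (1,) + later
        plan = [0] * horizon
        batch_period = 0
        for t in range(horizon):
            if setups[t]:
                batch_period = t  # a new batch starts with zero entering inventory
            plan[batch_period] += demands[t]
        plans.append(tuple(plan))
    return plans


plans = dominant_sequences([10, 15, 5])
for plan in plans:
    print(plan)
```

For the demands 10, 15, and 5 this reproduces the four alternatives listed in the text; in general the loop visits $2^{T-1}$ setup patterns.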



The Manne [4] formulation essentially minimizes the amount of setup time required in a finite horizon. Let $a_i$ represent the setup time for product $i$ and $b_i$ the unit processing time. Manne [4] defines the coefficients $\gamma_{ij\tau}$ = hours required for product $i$ in period $\tau$ using sequence $j$. If (1, 0, 1) represents sequence $j$ in a three-period horizon for which setups occur in periods one and three, then

$\gamma_{ij1} = a_i + b_i (r_{i1} + r_{i2})$,
$\gamma_{ij2} = 0$,
$\gamma_{ij3} = a_i + b_i r_{i3}$,

where $r_{i\tau}$ represents the requirements (demand) for product $i$ in period $\tau$. For each product one must choose an appropriate sequence within the constraints of available hours in each period. Let $S_\tau$ and $V_\tau$ represent respectively the amount of straight time and overtime available in period $\tau$. Define $x_{ij}$ as the proportion of demand for product $i$ that is satisfied via sequence $j$ and $l_\tau$ the amount of overtime to be scheduled in period $\tau$. The linear program of Manne [4] is

(Manne)*  $\min \sum_{\tau=1}^{T} l_\tau$

$(w_i)$  (a)  $\sum_{j \in J} x_{ij} = 1$,  $i = 1, 2, \ldots, n$,

$(u_\tau)$  (b)  $\sum_{i=1}^{n} \sum_{j \in J} \gamma_{ij\tau} x_{ij} - l_\tau \le S_\tau$,  $\tau = 1, 2, \ldots, T$,

$(z_\tau)$  (c)  $l_\tau \le V_\tau$,  $\tau = 1, 2, \ldots, T$,

$x_{ij}, l_\tau \ge 0$.

We shall associate dual variables $(w_i, u_\tau, z_\tau)$ along with constraint categories (a), (b), and (c). Observe that the number of structural variables would be $n 2^{T-1} + T$ and the number of constraints would be $n + 2T$.
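As a rough illustration of how the coefficients $\gamma_{ij\tau}$ and the problem size arise, the following Python sketch computes the hours charged to each period for one product and one setup pattern. The function names `hours_by_period` and `lp_size` and the numerical values are hypothetical choices made for this example, not part of Manne's formulation.

```python
def hours_by_period(a_i, b_i, demands, setups):
    """Return [gamma_ij1, ..., gamma_ijT] for one product and one setup pattern.

    a_i: setup time, b_i: unit processing time, demands: r_i1..r_iT,
    setups: tuple of 0/1 flags, one per period (the first must be 1).
    """
    if setups[0] != 1:
        raise ValueError("the first period must contain a setup")
    hours = [0.0] * len(demands)
    batch_period = 0
    for t, demand in enumerate(demands):
        if setups[t]:
            batch_period = t
            hours[batch_period] += a_i        # setup time, once per batch
        hours[batch_period] += b_i * demand   # processing time for period-t demand
    return hours


def lp_size(n, horizon):
    """Structural variables and constraints of (Manne) for n products."""
    return n * 2 ** (horizon - 1) + horizon, n + 2 * horizon


# Sequence (1, 0, 1): setups in periods one and three.
gamma = hours_by_period(2.0, 0.5, [10, 15, 5], (1, 0, 1))
print(gamma)
print(lp_size(10, 4))
```

With ten products and four periods the count is already 84 structural variables, which shows how quickly the column set grows with $T$.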


One difficulty in implementing this linear programming formulation relates to its size. For large T and n, several thousand columns may be required. A discussion of these difficulties is given by Kortanek, Sodaro, and Soyster [2]. Certain advanced programming techniques have been formulated so that not all of the alternative sequences need to be explicitly considered. Dzielinski and Gomory [1] and Lasdon and Terjung [3] have developed column-generating procedures to cope with the size complexities. Kortanek, Sodaro, and Soyster [2] and May [5] have suggested certain simplifying formulations of the capacitated lot-size problem.

The column-generation procedures are motivated by the relative ease with which the uncapacitated lot-size problem can be solved. In Wagner and Whitin [9] and Wagner [7] a highly efficient dynamic-programming algorithm is developed to handle the uncapacitated problem. A detailed discussion of these techniques is given in Wagner [8].

In  this  paper  we  consider  the  capacitated  lot-size  problem  for  which  product  demands 
have  a special  structure.  We  make  the  following  definition: 

DEFINITION: Let $r_1 = (r_{11}, r_{21}, r_{31}, \ldots, r_{n1})$ be the vector of first-period demands for all products. Product demand is said to be intertemporally proportional if there exists $\alpha_\tau > 0$, $\tau = 2, 3, \ldots, T$, such that $r_\tau = \alpha_\tau r_1$ for $\tau = 2, 3, \ldots, T$, where $r_\tau = (r_{1\tau}, r_{2\tau}, r_{3\tau}, \ldots, r_{n\tau})$ is the vector of requirements in period $\tau$.

*We shall take $S_\tau = 0$ so that the objective function is equivalent to minimizing total setup time.

This proportionality was observed in [2], where the product line was various polyurethane parts that were fabricated for the interior of automobiles. For example, left and right arm rests come in pairs, and for each pair a dashboard cover was fabricated. One should expect that this is often the case, e.g., the production of chairs and tables is dictated by complementary consumer demands. Products with several components and subassemblies would mandate certain proportional demands, at least in the long run. The main purpose of this paper is to characterize the form of the optimal solution to the capacitated lot-size problem when the product demand remains in a fixed proportion from period to period.
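Whether a demand matrix is intertemporally proportional can be checked mechanically. The sketch below is a minimal illustration, assuming integer demands with positive first-period values; the function name `proportionality_factors` is hypothetical, and the sample rows are the demands of products 1 to 3 in Table 1 of Section 3.

```python
from fractions import Fraction


def proportionality_factors(demand):
    """Return [alpha_2, ..., alpha_T] if demand is intertemporally proportional.

    demand[i][t] is the (integer) demand for product i in period t + 1.
    Returns None when no common positive factors exist.
    """
    if any(row[0] <= 0 for row in demand):
        return None  # assumption of this sketch: first-period demands are positive
    factors = []
    for t in range(1, len(demand[0])):
        # Every product must show the same ratio r_it / r_i1 in period t.
        ratios = {Fraction(row[t], row[0]) for row in demand}
        if len(ratios) != 1:
            return None
        alpha = ratios.pop()
        if alpha <= 0:
            return None
        factors.append(alpha)
    return factors


sample = [
    [21, 35, 56, 42],  # product 1
    [6, 10, 16, 12],   # product 2
    [12, 20, 32, 24],  # product 3
]
alphas = proportionality_factors(sample)
print(alphas)
```

Exact rational arithmetic is used so that the equality test on ratios is not disturbed by floating-point rounding.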

When the demand pattern is intertemporally proportional (or proportional for short) certain surrogate measures of scheduling characteristics are also maintained. Suppose the n products are renumbered so that

(1-1)  $a_1/(b_1 r_{11}) \ge a_2/(b_2 r_{21}) \ge \ldots \ge a_n/(b_n r_{n1})$,

i.e., product 1 has the largest ratio of setup to first-period processing time. Note that this set of inequalities would remain intact for any subhorizon of demand if the demand pattern is proportional.
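The renumbering in (1-1) amounts to sorting the products by the ratio of setup time to first-period processing time. The following sketch, with hypothetical function names, applies this to the setup times, unit processing times, and first-period demands as read from Table 1 of Section 3.

```python
def setup_to_run_ratios(a, b, r1):
    """Ratio a_i / (b_i * r_i1) for each product."""
    return [a_i / (b_i * r_i) for a_i, b_i, r_i in zip(a, b, r1)]


def renumbering(a, b, r1):
    """Product indices (0-based) ordered by nonincreasing ratio, as in (1-1)."""
    ratios = setup_to_run_ratios(a, b, r1)
    return sorted(range(len(ratios)), key=lambda i: -ratios[i])


# Data as read from Table 1: setup time (hr), unit time (hr), period-1 demand.
a = [9, 4, 8, 4, 8, 8, 3, 3, 1, 1]
b = [0.1, 0.2, 0.4, 0.5, 0.7, 0.8, 0.6, 0.5, 0.5, 0.7]
r1 = [21, 6, 12, 6, 12, 15, 9, 27, 12, 24]

ratios = setup_to_run_ratios(a, b, r1)
order = renumbering(a, b, r1)
print([round(x, 3) for x in ratios])
print(order)
```

For these data the products are already numbered in the order required by (1-1), so the computed ordering is the identity.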

2.  CAPACITATED  LOT-SIZE  SOLUTIONS  WITH  PROPORTIONAL  DEMAND 

For the capacitated lot-size problem in a T-period horizon, one specifies the demands for each product $i = 1, 2, \ldots, n$ and each period $\tau = 1, 2, \ldots, T$, say $r_{i\tau}$. The demand requirements are conveniently summarized in matrix format (Fig. 1). The assumption that the demand is intertemporally proportional means that each row in this matrix is a positive multiple of the first row. We assume that the products have been renumbered, if necessary, so that (1-1) holds.


                          Product
Period       1       2       3      . . .     n
  1        r_11    r_21    r_31     . . .    r_n1
  2        r_12    r_22    r_32     . . .    r_n2
  .          .       .       .                 .
  .          .       .       .                 .
  T        r_1T    r_2T    r_3T     . . .    r_nT

Figure 1


The main results of this paper concern the form of optimal solutions to (Manne) when the demand is proportional, and they concern computational shortcuts based on the form of the solutions. An optimal solution to (Manne) specifies batch sizes or production levels for each product in each period of the T-period horizon. Consider batch sizes for each product in period 1. If $r_{i1} > 0$ for each $i$, then each product must be set up in period 1. In general, the first-period batch sizes specified by an optimal solution to (Manne) can be illustrated as shown in Fig. 2. The shaded areas in Fig. 2 represent the time periods of demand satisfied by batches scheduled in period 1 for the 4-period, 10-product example. In Theorem 2 and Corollary 2.1 it will be shown that when (1-1) holds the schematic diagram will always exhibit a triangular shape (Fig. 3).


Figure 2

Figure 3 (horizontal axis: Products 1 to 10)


DEFINITION: A solution to (Manne) is triangular if, when $i_1 < i_2$ and $i_1$ and $i_2$ are being produced in the same period, then the number of periods of demand satisfied by the batch for product $i_1$ will always be greater than or equal to the number of periods of demand satisfied by the batch for product $i_2$.

To establish the existence of triangularity we need the following theorems. An important duality result of (Manne) deals with the monotonicity of the optimal dual variables associated with (b), $\{u^*_\tau\}$. Manne [4] suggested that the following theorem should be true. We include a proof of this theorem in the appendix.

THEOREM 1: The dual variables $\{u^*_\tau\}$ are nonincreasing, i.e.,

$u^*_\tau \ge u^*_{\tau+1}$,  $\tau = 1, 2, \ldots, T - 1$.


This is an intuitive result, because anything that is produced in a later time period could be produced earlier if time were available. This makes an earlier time period at least as valuable as a later time period since there is no holding cost for inventory.


Next, we review a recent result [6] concerning the sensitivity of the unconstrained lot-size problem. Consider a horizon length T and let $r_\tau$ be the demand for a single product in period $\tau$. Assume that the cost for a lot $y_\tau$ in period $\tau$ is

$f(y_\tau) = c_\tau + d_\tau y_\tau$ if $y_\tau > 0$, and $f(y_\tau) = 0$ otherwise,

where $c_\tau$ is the cost of a setup in period $\tau$ and $d_\tau$ is the unit production cost in period $\tau$. The dynamic lot-size problem seeks the vector $(y_1, y_2, \ldots, y_T)$ which minimizes

(2-2)  $\sum_{\tau=1}^{T} f(y_\tau)$.

Let $I_\tau$ be the ending inventory in period $\tau$; if $\{y^*_\tau, I^*_\tau\}$ is an optimal solution then period $\tau$ is called a regeneration point if $I^*_\tau = 0$ ($\tau < T$).
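The dynamic lot-size problem (2-2) can be solved by a short forward recursion in the spirit of the dynamic-programming approach of Wagner and Whitin [9]. The sketch below is an illustration rather than the algorithm of [9] itself: the function names are hypothetical, demand is assumed positive in every period, and there is no holding cost, as in the cost function above.

```python
def dynamic_lot_size(demand, setup_cost, unit_cost):
    """Minimize the sum of c_t + d_t * y_t over batches; returns (cost, setup periods).

    Periods are 1-based in the returned list. Assumes positive demand in
    every period and restricts attention to plans with y_t * I_{t-1} = 0.
    """
    horizon = len(demand)
    prefix = [0]
    for r in demand:
        prefix.append(prefix[-1] + r)
    best = [0.0] + [float("inf")] * horizon  # best[t]: min cost covering periods 1..t
    last_setup = [0] * (horizon + 1)         # period of the final batch in that plan
    for t in range(1, horizon + 1):
        for s in range(1, t + 1):            # final batch made in s covers s..t
            quantity = prefix[t] - prefix[s - 1]
            cost = best[s - 1] + setup_cost[s - 1] + unit_cost[s - 1] * quantity
            if cost < best[t]:
                best[t], last_setup[t] = cost, s
    setups, t = [], horizon
    while t > 0:                             # trace the batches backwards
        setups.append(last_setup[t])
        t = last_setup[t] - 1
    return best[horizon], sorted(setups)


def regeneration_points(setups):
    """Periods whose ending inventory is zero: one before each later setup."""
    return [s - 1 for s in setups if s > 1]


cost_a, setups_a = dynamic_lot_size([10, 15, 5], [10, 10, 10], [1, 1, 1])
cost_b, setups_b = dynamic_lot_size([10, 15, 5], [1, 1, 1], [3, 2, 1])
print(cost_a, setups_a, regeneration_points(setups_a))
print(cost_b, setups_b, regeneration_points(setups_b))
```

In the first call a single setup in period 1 is cheapest and there are no regeneration points; in the second, low setup costs and declining unit costs make a setup in every period optimal.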

Now consider two alternative setup cost structures,

(i)  $c_1 \ge c_2 \ge \ldots \ge c_T$,

(ii)  $\alpha c_1 \ge \alpha c_2 \ge \ldots \ge \alpha c_T$,

where $\alpha \in (0, 1)$. In both cases the setup costs are nonincreasing, but in case (ii) the setup costs are proportionately reduced by the same factor in each time period.

Consider an optimal solution for (2-2) when the setup costs are given by (i). An optimal solution is conveniently specified via the set of regeneration points for the given sequence. For example, by $(h_1, h_2, \ldots, h_r)$ we mean that $h_1$ is the period in which the first regeneration point occurs, $h_2$ the second, and so on. In this case $r$ regeneration points occur. (Note that the sequence in which a single setup occurs, i.e., the initial period, has no regeneration points, $r = 0$.) The following theorem is proven in [6].

THEOREM 2: Assume that $c_\tau \ge c_{\tau+1}$ and $d_\tau \ge d_{\tau+1}$ for $\tau = 1, 2, \ldots, T - 1$, and let $(h_1, h_2, \ldots, h_r)$ be an optimal solution to (2-2) for setup costs given by (i). Then for setup costs given by (ii), i.e., proportionally smaller, there exists an optimal solution $(k_1, k_2, \ldots, k_{r'})$ with the following properties: $r' \ge r$ and $k_\tau \le h_\tau$ for $\tau = 1, 2, \ldots, r$.

The essence of this theorem is simple. If setup costs are reduced then the number of regeneration points (and hence setups) can only increase. Furthermore, since $k_1 \le h_1$, the first-period batch size will not increase if setup costs are reduced. In general, if $\tau$ is a regeneration point for both cost structures, then the lot size scheduled in period $\tau + 1$ for the higher setup costs would be at least as large as the lot size in $\tau + 1$ for the lower setup costs. This follows since $\tau$ is a regeneration point, so that the optimal solution to both cost structures for the horizon $[\tau + 1, T]$ is identical to the corresponding periods in the optimal solution for the horizon $[1, T]$.
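The behavior described by Theorem 2 can be observed on a small instance by brute force. The sketch below is only a numerical illustration, not a proof: the instance, the scaling factor 0.1, and the function name `best_plan` are assumptions chosen so that each optimum is unique.

```python
from itertools import product


def best_plan(demand, setup_cost, unit_cost):
    """Enumerate all setup patterns; return (min cost, regeneration points)."""
    horizon = len(demand)
    best = None
    for later in product((0, 1), repeat=horizon - 1):
        setups = (1,) + later
        cost, batch_period = 0.0, 0
        for t in range(horizon):
            if setups[t]:
                batch_period = t
                cost += setup_cost[t]
            cost += unit_cost[batch_period] * demand[t]
        # A setup in period t + 1 makes period t (1-based) a regeneration point.
        regen = [t for t in range(1, horizon) if setups[t]]
        if best is None or cost < best[0]:
            best = (cost, regen)
    return best


demand = [10, 15, 5, 20]
c = [40, 30, 20, 10]   # nonincreasing setup costs, structure (i)
d = [4, 3, 2, 1]       # nonincreasing unit costs
alpha = 0.1            # structure (ii): every setup cost scaled by alpha

cost_hi, h = best_plan(demand, c, d)
cost_lo, k = best_plan(demand, [alpha * x for x in c], d)
print(cost_hi, h)
print(cost_lo, k)
```

On this instance the regeneration points move from (3) to (1, 2, 3): there are more of them and the first one occurs no later, as the theorem asserts.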




Next let us return to the consideration of (Manne) when the product demand is intertemporally proportional. Suppose that $(w^*, u^*, z^*)$ is an optimal dual solution. It then follows from dual feasibility that for each product $i$

(2-3)  $\sum_{\tau=1}^{T} \gamma_{ij\tau} u^*_\tau + w^*_i \ge 0$,  $j \in J$,

and, moreover, if $\{x^*_{ij}\}$ is an optimal primal solution and $x^*_{ij_1} > 0$, then (2-3) is a strict equality for $j = j_1$. In fact, these duality considerations are often used to sequentially generate candidate sequences when the set $J$ is very large [1]. The optimal sequence $j$ (or set of sequences) for product $i$ can be obtained by minimizing the left-hand side of (2-3). For a given product $i$, $w^*_i$ is a constant, and an optimal sequence is one which minimizes (over $j \in J$) the quantity

(2-4)  $\sum_{\tau=1}^{T} \gamma_{ij\tau} u^*_\tau$.

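The general form of (2-4) for some arbitrary sequence $j$ with regeneration points $(h_1, h_2, \ldots, h_r)$ is

(2-5)  $u^*_1 \left[ a_i + b_i \sum_{\tau=1}^{h_1} r_{i\tau} \right] + u^*_{h_1+1} \left[ a_i + b_i \sum_{\tau=h_1+1}^{h_2} r_{i\tau} \right] + \ldots + u^*_{h_r+1} \left[ a_i + b_i \sum_{\tau=h_r+1}^{T} r_{i\tau} \right]$.

For the case in which product demand is proportional, i.e., $r_{i\tau} = \alpha_\tau r_{i1}$ (with $\alpha_1 = 1$), (2-5) can be written as

(2-6)  $u^*_1 \left[ a_i + b_i r_{i1} \sum_{\tau=1}^{h_1} \alpha_\tau \right] + \ldots + u^*_{h_r+1} \left[ a_i + b_i r_{i1} \sum_{\tau=h_r+1}^{T} \alpha_\tau \right]$,

or equivalently,

(2-7)  $b_i r_{i1} \left[ c_{i1} + d_1 \sum_{\tau=1}^{h_1} \alpha_\tau + c_{i,h_1+1} + d_{h_1+1} \sum_{\tau=h_1+1}^{h_2} \alpha_\tau + \ldots + c_{i,h_r+1} + d_{h_r+1} \sum_{\tau=h_r+1}^{T} \alpha_\tau \right]$,

where $c_{i\tau} = u^*_\tau a_i / (b_i r_{i1})$ and $d_\tau = u^*_\tau$. Observe, then, that $c_{i\tau} \ge c_{i,\tau+1}$ and $d_\tau \ge d_{\tau+1}$, since $u^*_\tau \ge u^*_{\tau+1}$. Since (2-7) is the general form of (2-4) for each product $i$ and sequence $j$, the following corollary is obtained: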

COROLLARY 2.1 (Triangularity): Assume that the products are numbered so that $a_1/(b_1 r_{11}) \ge a_2/(b_2 r_{21}) \ge \ldots \ge a_n/(b_n r_{n1})$ and that the demand is intertemporally proportional throughout the T-period horizon. There exists an optimal basic solution $\{x^*_{ij}\}$ to (Manne) with the following property. Let $i_1 < i_2$, and suppose that $x^*_{i_1 j_1}$ is basic. If products $i_1$ and $i_2$ are produced in period $p$ and product $i_1$ has its first regeneration point after $p$ in period $p_1$, then, for any sequence $j_2$ for which $x^*_{i_2 j_2}$ is basic, a first regeneration after $p$ exists at $p_1$ or earlier.

PROOF: The proof follows directly from Theorem 2 and (2-7). Observe that if $i_1 < i_2$ then, by (1-1) and the definition of $c_{i\tau}$,

$c_{i_1 \tau} \ge c_{i_2 \tau}$,  $\tau = 1, 2, \ldots, T$,

and $d_\tau$ is the same for both products. Since $\{c_{i_1 \tau}\}$, $\{c_{i_2 \tau}\}$, and $\{d_\tau\}$ are nonincreasing in $\tau$, the corollary is a consequence of Theorem 2 and of the fact that minimization of (2-7) yields optimal sequences $j_1$ and $j_2$ for products $i_1$ and $i_2$. QED.

The corollary characterizes the form of optimal solutions to (Manne) when the product demand is proportional. In particular, the first-period batch sizes vary in accordance with the ratio of setup to run time. A product with a large ratio of setup to run time should have a first-period batch size that is greater than or equal to the batch size for a product with a smaller ratio of setup to run time. This corollary also corrects an improper assertion made in [2]. In [2] it was erroneously asserted that, when demand is proportional, before any demand in excess of $r_{i_2 1}$ is scheduled in period 1 for product $i_2$, all demand for product $i_1$ would be scheduled in period 1, $i_1 < i_2$.

COROLLARY 2.2 (Monotone Set-ups): Assume that the products are numbered so that $a_1/(b_1 r_{11}) \ge a_2/(b_2 r_{21}) \ge \ldots \ge a_n/(b_n r_{n1})$ and that the demand is intertemporally proportional throughout the T-period horizon. There exists an optimal basic solution $\{x^*_{ij}\}$ to (Manne) with the following property. Let $i_1 < i_2$ and suppose that $x^*_{i_1 j_1}$ is basic. There exists a sequence $j_2$ with at least as many setups as $j_1$ for which $x^*_{i_2 j_2}$ is basic.

PROOF: The result follows directly from Theorem 2 in a manner analogous to the proof of Corollary 2.1.


3.  EXAMPLES  AND  USES  OF  TRIANGULARITY 

For the case in which demand is intertemporally proportional, the optimal basic solutions to (Manne) exhibit a regular pattern as illustrated by Fig. 3. An algebraic interpretation exists in terms of the set of alternative production sequences. This is illustrated for a four-period horizon. For a horizon of $T = 4$ there exist eight alternative sequences, which we number as follows:

Sequence    1  2  3  4  5  6  7  8
Period 1    1  1  1  1  1  1  1  1
Period 2    0  0  0  0  1  1  1  1
Period 3    0  0  1  1  0  0  1  1
Period 4    0  1  0  1  0  1  0  1

Observe that the sequences are numbered in order of decreasing first regeneration point; if a tie exists, then according to decreasing order of second regeneration point, and so on. An implication of Corollary 2.1 is that if $x^*_{i_1 j}$ is basic for $j = 5, 6, 7,$ or $8$, and $i_1 < i_2$, then $x^*_{i_2 j}$ is not basic for $j = 1, 2, 3,$ or $4$.
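The numbering rule just described can be reproduced programmatically. In the sketch below the function name `numbered_sequences` is hypothetical, and a missing regeneration point is treated as occurring at period $T$, which is one convenient convention (an assumption of this example) for breaking ties in the stated order.

```python
from itertools import product


def numbered_sequences(horizon):
    """Setup patterns ordered by decreasing first, second, ... regeneration point."""
    patterns = [(1,) + rest for rest in product((0, 1), repeat=horizon - 1)]

    def key(pattern):
        # A setup in period t + 1 makes period t (1-based) a regeneration point.
        regen = [t for t in range(1, horizon) if pattern[t]]
        # Pad with the horizon so that "no further regeneration point" sorts first.
        return regen + [horizon] * (horizon - 1 - len(regen))

    return sorted(patterns, key=key, reverse=True)


sequences = numbered_sequences(4)
for number, pattern in enumerate(sequences, start=1):
    print(number, pattern)
```

For $T = 4$ the output lists the eight patterns in the same order as the numbering above; sequence 4 is (1, 0, 1, 1) and sequence 5 is (1, 1, 0, 0).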




TABLE 1

                                    Demand in period
Product   a_i (hr)   b_i (hr)      1     2     3     4
   1         9         0.1        21    35    56    42
   2         4         0.2         6    10    16    12
   3         8         0.4        12    20    32    24
   4         4         0.5         6    10    16    12
   5         8         0.7        12    20    32    24
   6         8         0.8        15    25    40    30
   7         3         0.6         9    15    24    18
   8         3         0.5        27    45    72    54
   9         1         0.5        12    20    32    24
  10         1         0.7        24    40    32    48

Actually, a much stronger result is implied by Corollary 2.1. If $i_1 < i_2$ and $x^*_{i_1 j_1}$ is basic, then for any basic $x^*_{i_2 j_2}$ it follows, according to the given numbering system, that $j_2 \ge j_1$. Consider the example with ten products in Table 1. For resource availabilities let $V_1 = 200$, $V_2 = 150$, $V_3 = 150$, $V_4 = 150$, and all $S_\tau = 0$. The (unique) optimal basic solution to (Manne) produces the following sequences:

Product   Sequence
   1        1
   3        1
   4        1
   5        1, 3
   6        3, 5
   7        5, 6
   8        6
   9        6

Corollary 2.2 further constrains the allowable occurrences of the alternative sequences for proportional demand. Note that sequence 4 has a first regeneration point (period 2) later than sequence 5 (period 1), but sequence 4 contains three setups and sequence 5 only two setups. If $i_1 < i_2$ and $x^*_{i_1 4}$ is basic, then an optimal basic solution with $x^*_{i_2 j} > 0$ exists for $j = 6, 7,$ or $8$. Taken together, Corollaries 2.1 and 2.2 imply that in a given program an optimal solution exists such that sequences 4 and 5 may not occur with different products. At most they can occur simultaneously for one single product. For example, if $i_1 < i_2$, then $x^*_{i_1 4}$ and $x^*_{i_2 5}$ both basic violates Corollary 2.2, and $x^*_{i_1 5}$ and $x^*_{i_2 4}$ both basic violates Corollary 2.1.

One of the most efficient approaches to solving (Manne) is to combine column generation with generalized upper bounding. That is, for each product use the dual variables as an opportunity cost for time, $u_\tau a_i$ for the setup cost for product $i$ in period $\tau$, and $u_\tau b_i$ for the production cost per unit in period $\tau$. Then for each product solve the Wagner-Whitin problem to find the lowest-cost production sequence. This finds the production sequence for each product with the lowest reduced cost in the linear programming sense (see Dzielinski and Gomory [1]). After the Wagner-Whitin problem is solved for each product, the sequence with the lowest reduced cost is pivoted into the basis. The column-generation linear programming iterations are continued until the sequences with the lowest reduced costs are already in the basis or would not improve the solution when pivoted into the basis, i.e., have a reduced cost of zero. Now suppose that this column-generation approach is applied to the numerical example of this section and that the foregoing optimal solution is obtained at some iteration. For illustrative purposes suppose that this feasible solution is not optimal and subsequent iterations are required. How can triangularity simplify the column-generation procedure? If at this current iteration it is true that $\{u_\tau\}$ are nonincreasing, then according to Corollaries 2.1 and 2.2 only a small number of products and sequences need to be priced for possible entry into the next linear programming solution. In particular, Corollaries 2.1 and 2.2 state that those sequences which maintain triangularity will always provide a lower reduced cost. For the example problem, one should only price sequence 2 for product 5 and sequence 7 for products 9 and 10. One would not need to price sequences for the other products. The product-sequence combination with the lowest reduced cost would be among this set of three alternatives. The column-generation routine could be structured to eliminate the generation of all sequences for a large number of products and to reduce significantly the number of alternative sequences for the remaining products.

The impact of this result upon column-generation procedures for (Manne) has not been tested; increased efficiency will depend upon the form of intermediate solutions, i.e., at what point intermediate solutions provide monotone nonincreasing dual variables and triangular-type feasible solutions. However, since at optimality the dual variables must be monotone nonincreasing, adding constraints to the dual of (Manne), i.e., columns to the primal, which enforce monotonicity does not affect the optimal solution. The form of these constraints in the dual would be simply $u_\tau \ge u_{\tau+1}$ for $\tau = 1, 2, \ldots, T - 1$. In this manner, the monotonicity requirement is maintained at each iteration. The trade-off here is that if one or more of these $T - 1$ artificial columns is positive at some intermediate iteration then the current solution is not feasible for (Manne). The overall efficiency gained by employing these triangularity results will be explored in a subsequent paper.

4.  CONCLUSION 

Given proportional product demands, we have shown that a triangularity property exists which greatly simplifies the computational needs for solving Manne's production-scheduling model. These results are useful under certain circumstances when product proportionality is not met. First, in the illustrations we have assumed that all products had requirements to be met in every period. In reality there is inventory to meet early requirements for many products. Therefore, the requirements to be met from production are not proportional. But Corollary 2.1 shows that triangularity holds from the point of first production onward. The triangular structure for period $\tau$ production is for products with requirements in period $\tau$ not met by inventory.

Second, since demand proportionality is not a requirement for monotonically nonincreasing dual variables, if there is a subset of products with demand proportionality within a larger set of products, triangularity holds for the subset. There may be many subsets having demand proportionality within the subsets. Again, triangularity and the column-generation reduction procedures hold within the subsets.



REFERENCES 

[1] Dzielinski, B. P., and R. E. Gomory, "Optimal Programming of Lot Sizes, Inventory and Labor Allocations," Management Science 11, 874-890 (1965).

[2] Kortanek, K. O., D. Sodaro, and A. L. Soyster, "Multi-Product Production Scheduling via Extreme Point Properties of Linear Programming," Naval Research Logistics Quarterly 15, 287-300 (1968).

[3] Lasdon, L. S., and R. C. Terjung, "An Efficient Algorithm for Multi-Item Scheduling," Operations Research 19, 946-969 (1971).

[4] Manne, A. S., "Programming of Economic Lot Sizes," Management Science 4, 115-135 (1958).

[5] May, J. G., "A Linear Program for Economic Lot Sizes Using Labor Priorities," Management Science 21, 277-285 (1974).

[6] Murphy, F. H., and A. L. Soyster, "Sensitivity Analysis of the Cost Parameters in the Dynamic Lot Size Model," Working paper, Virginia Polytechnic Institute and State University, Blacksburg, Virginia (1977).

[7] Wagner, H. M., "A Postscript to Dynamic Problems in the Theory of the Firm," Naval Research Logistics Quarterly 7, 7-12 (1960).

[8] Wagner, H. M., Principles of Operations Research (Prentice-Hall, Englewood Cliffs, N.J., 1969).

[9] Wagner, H. M., and T. M. Whitin, "A Dynamic Version of the Economic Lot Size Model," Management Science 5, 89-96 (1958).

APPENDIX 

THEOREM 1: The dual variables $\{u^*_\tau\}$ are nonincreasing, i.e.,

$u^*_\tau \ge u^*_{\tau+1}$,  $\tau = 1, 2, \ldots, T - 1$.

PROOF: We first show that $u^*_1 \ge u^*_2$ and then use an induction step to complete the proof.

To show that $u^*_1 \ge u^*_2$, first note that if no production is scheduled in period 2 then $u^*_2 = 0$. In this case obviously $u^*_1 \ge u^*_2$. Next, suppose that production for some product $i$ is scheduled in period 2, i.e., $x^*_{ij_1} > 0$ for some sequence $j_1$. Consider the sequence $j_2$ that differs from $j_1$ in that no setup is specified for period 2. Using complementary slackness and dual feasibility of problem (Manne) one obtains

(A1)  $\sum_{\tau=1}^{T} \gamma_{ij_1\tau} u^*_\tau + w^*_i = 0$,

(A2)  $\sum_{\tau=1}^{T} \gamma_{ij_2\tau} u^*_\tau + w^*_i \ge 0$.

Subtracting (A1) from (A2), we obtain

(A3)  $\sum_{\tau=1}^{T} (\gamma_{ij_2\tau} - \gamma_{ij_1\tau}) u^*_\tau \ge 0$,

and since $\gamma_{ij_2\tau} = \gamma_{ij_1\tau}$ for $\tau = 3, 4, \ldots, T$, it follows from (A3) that

(A4)  $\sum_{\tau=1}^{2} (\gamma_{ij_2\tau} - \gamma_{ij_1\tau}) u^*_\tau \ge 0$.

Now let $t'$, $2 \le t' \le T$, be the latest-period demand that is scheduled in period 2 according to sequence $j_1$ (which is also the latest-period demand scheduled in period 1 for sequence $j_2$). From (A4) it follows that

(A5)  $\left[ \left( a_i + b_i \sum_{\tau=1}^{t'} r_{i\tau} \right) - (a_i + b_i r_{i1}) \right] u^*_1 - \left( a_i + b_i \sum_{\tau=2}^{t'} r_{i\tau} \right) u^*_2 \ge 0$,

which, after simplification, yields

(A6)  $(u^*_1 - u^*_2) \, b_i \sum_{\tau=2}^{t'} r_{i\tau} \ge u^*_2 a_i$.

Since $a_i > 0$, then (A6) implies $u^*_1 \ge u^*_2$.

Now suppose that $u^*_1 \ge u^*_2 \ge \ldots \ge u^*_\tau$. We will show that $u^*_\tau \ge u^*_{\tau+1}$. Again, if no production is scheduled in period $\tau + 1$, then $u^*_{\tau+1} = 0$ and the conclusion is clear. Hence, suppose that production for some product $i$ is scheduled in period $\tau + 1$, i.e., $x^*_{ij_1} > 0$ for some sequence $j_1$. We need to consider two cases:

(1) Sequence $j_1$ specifies a setup in period $\tau$.

(2) Sequence $j_1$ specifies no setup in period $\tau$.

For case (1) consider the sequence $j_2$ that differs from $j_1$ only in that there is no setup in period $\tau + 1$. As before, it follows from complementary slackness and dual feasibility that

(A8)  $\sum_{t=1}^{T} (\gamma_{ij_2 t} - \gamma_{ij_1 t}) u^*_t \ge 0$.

Since sequences $j_1$ and $j_2$ differ only in periods $\tau$ and $\tau + 1$, (A8) implies

(A9)  $(\gamma_{ij_2\tau} - \gamma_{ij_1\tau}) u^*_\tau + (\gamma_{ij_2,\tau+1} - \gamma_{ij_1,\tau+1}) u^*_{\tau+1} \ge 0$.

If $t'$, $\tau + 1 \le t' \le T$, is the latest-period demand scheduled in period $\tau + 1$ for sequence $j_1$ (which is also the latest-period demand scheduled in period $\tau$ for sequence $j_2$), then from (A9) one obtains

(A10)  $\left( b_i \sum_{t=\tau+1}^{t'} r_{it} \right) u^*_\tau - \left( a_i + b_i \sum_{t=\tau+1}^{t'} r_{it} \right) u^*_{\tau+1} \ge 0$.

Rearranging (A10) we obtain

(A11)  $(u^*_\tau - u^*_{\tau+1}) \, b_i \sum_{t=\tau+1}^{t'} r_{it} \ge u^*_{\tau+1} a_i$,

which implies that $u^*_\tau \ge u^*_{\tau+1}$.

For case (2) let sequence $j_2$ be defined as in case (1), i.e., sequence $j_2$ has a setup in period $\tau$, but no setup in period $\tau + 1$. Let $\hat{t} < \tau$ be the period for which period $\tau$ demand is scheduled according to sequence $j_1$. Again, it follows that

(A12)  $\sum_{t=1}^{T} (\gamma_{ij_2 t} - \gamma_{ij_1 t}) u^*_t \ge 0$,

which simplifies to

(A13)  $(\gamma_{ij_2\hat{t}} - \gamma_{ij_1\hat{t}}) u^*_{\hat{t}} + (\gamma_{ij_2\tau} - \gamma_{ij_1\tau}) u^*_\tau + (\gamma_{ij_2,\tau+1} - \gamma_{ij_1,\tau+1}) u^*_{\tau+1} \ge 0$.

Substitution for the three terms in (A13) yields

(A14)  $-b_i r_{i\tau} u^*_{\hat{t}} + \left( a_i + b_i \sum_{t=\tau}^{t'} r_{it} \right) u^*_\tau + \left( -a_i - b_i \sum_{t=\tau+1}^{t'} r_{it} \right) u^*_{\tau+1} \ge 0$,

where $t'$ remains as previously defined. Next we use the induction hypothesis, $u^*_{\hat{t}} \ge u^*_\tau$, in (A14), which yields

(A15)  $\left( a_i + b_i \sum_{t=\tau+1}^{t'} r_{it} \right) u^*_\tau \ge \left( a_i + b_i \sum_{t=\tau+1}^{t'} r_{it} \right) u^*_{\tau+1}$.

Now case (2) follows from (A15), and the theorem is proven.


AN  EXACT  BRANCH-AND-BOUND  PROCEDURE  FOR  THE 
QUADRATIC-ASSIGNMENT  PROBLEM 


M.  S.  Bazaraa* 

School  of  Industrial  and  Systems  Engineering 
Georgia Institute of Technology
Atlanta, Georgia

A. N. Elshafei†

Institute  of  National  Planning 
Cairo,  Egypt 


ABSTRACT 

The quadratic-assignment problem is a difficult combinatorial problem which still remains unsolved. In this study, an exact branch-and-bound procedure, which is able to produce optimal solutions for problems with twelve facilities or less, is developed. The method incorporates the concept of stepped fathoming to reduce the effort expended in searching the decision trees. Computational experience with the procedure is presented.


1.  INTRODUCTION 


The  quadratic-assignment  problem  is  a combinatorial  problem  that  has  been  of  great 
interest  to  many  researchers.  The  problem  can  be  stated  as  follows: 


Minimize  £ £ £ £ cUkl  xiJ  xki  + TL  £ ft!  XiJ 


l-l  J- 1 *-l  /-I 


l-l  J- 1 


£ xu  “ L 

/ — 1.  .. 

• . m, 

j- 1 

m 

<-i 

J-l.  • 

• . m, 

xu  is  0 or  1, 

C 

■ 

, m. 

The problem can be interpreted as follows. We suppose that $m$ facilities or objects are to
be assigned to $m$ locations. Here, $x_{ij}$ is 1 if facility $i$ is placed in location $j$ and is 0 otherwise.
The quantity $c_{ijkl}$ is the cost of the mutual assignment of object $i$ to location $j$ and object $k$ to
location $l$, and it is usually determined as the number of interactions $u_{ik}$ between objects $i$ and $k$
weighted by the distance $d_{jl}$ from location $j$ to location $l$, that is, $c_{ijkl} = u_{ik} d_{jl}$. Furthermore, $f_{ij}$ is
the fixed cost of assigning facility $i$ to location $j$.
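
As a small illustration of the objective, the following Python sketch evaluates the quadratic-assignment cost of a complete assignment. The function name `qap_cost`, the list `loc` (with `loc[i]` the location of object `i`), and the 3-by-3 data are assumptions made only for this example; they do not come from the paper.

```python
def qap_cost(u, d, f, loc):
    """Cost of placing object i in location loc[i] for every i.

    u[i][k] : interaction between objects i and k
    d[j][l] : distance from location j to location l
    f[i][j] : fixed cost of placing object i in location j
    """
    m = len(loc)
    # Quadratic part: every ordered pair (i, k) contributes u_ik * d_{loc(i) loc(k)}.
    interaction = sum(u[i][k] * d[loc[i]][loc[k]]
                      for i in range(m) for k in range(m))
    # Linear part: fixed cost of each placement.
    fixed = sum(f[i][loc[i]] for i in range(m))
    return interaction + fixed


# Illustrative data (hypothetical): three objects, three locations on a line.
u = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]
d = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
f = [[0, 0, 0] for _ in range(3)]

cost_identity = qap_cost(u, d, f, [0, 1, 2])   # object 1 in the middle location
cost_swapped = qap_cost(u, d, f, [0, 2, 1])    # object 2 in the middle location
```

Here the assignment `[0, 1, 2]` costs 14 and `[0, 2, 1]` costs 18, since the object with the heaviest interactions is better placed at the central location.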


*This author's work was supported by NSF grant #GK-38337.
†Work done while visiting North Carolina State University.



Since Koopmans and Beckman [10] introduced the quadratic-assignment problem in the
context of locating indivisible objects, the problem has gained a great deal of popularity among
researchers, due mostly to its wide range of applications. In [16], Whitehead and Elders
discussed the use of the problem in the area of building layout. In [4], Elshafei described a
quadratic-assignment algorithm that can be used in the context of hospital layout. The problem
has also been used in the fields of urban planning, control-panel layout, and wiring design in
the placement of electronic components in an assembly. For details on these applications, the
reader may refer to Hopkins [9], Dorris [3], Breuer [2], Gaschutz and Ahrens [5], and
Steinberg [15].

Various procedures for the solution of the problem have been suggested in the literature,
including both exact and heuristic procedures. This study concerns itself with exact methods.
Currently, the available exact procedures are all of the branch-and-bound type and can be
classified into single-assignment algorithms, pair-assignment algorithms, and pair-exclusion
algorithms. Single-assignment algorithms proceed by the assignment of one unassigned facility
to a vacant location at any stage of the search process. The procedures of Gilmore [7], Graves
and Whinston [8], and Lawler [12] fall in this class. Neither Gilmore nor Lawler reported any
computational experience. Graves and Whinston compared their procedure with some existing
heuristics and provided better-quality solutions, but they did not guarantee optimality.
Pair-assignment methods proceed by simultaneous location of two facilities at two unoccupied
locations. In [11], Land described a pair-assignment algorithm that first reduces the cost matrix so
that it contains a zero in each row and a zero in each column. Gavett and Plyter [6] extended
the method of Land by tightening the computation of the lower bounds. They reported that
their algorithm took 14 min on an IBM 7044 machine to solve a problem of size $m = 7$ and 42
min for $m = 8$. Pair-exclusion algorithms proceed on the basis of a stage-by-stage exclusion of
assignments from a solution to the problem. In [14], Pierce and Crowston reported the results
for this procedure for a problem of size four facilities.

In this study, we discuss an exact branch-and-bound scheme for solving the
quadratic-assignment problem. The method is similar to that suggested by Gilmore [7], but it differs in
the computation of the lower bounds and in the branching rules. It also incorporates the
concept of stepped fathoming given in [1] for speeding the search of the decision tree. The
reported algorithm was able to find the optimal solution of a problem of size 12 facilities but
failed to produce exact solutions for problems of size $m \geq 15$.

2.  AN  EXACT-SOLUTION  PROCEDURE 

In this section we describe a branch-and-bound solution procedure which can be used to
obtain optimal and qualified suboptimal solutions. The following notation will be used. The
location to which object $i$ is assigned is denoted by $i^*$. Hence, the mutual cost of assigning
objects $i$ and $j$ is $u_{ij} d_{i^*j^*} + u_{ji} d_{j^*i^*}$, and the fixed cost of these assignments is $f_{ii^*} + f_{jj^*}$.

Decision Tree and General Framework

At each stage of the algorithm, we have a set of objects that has already been assigned to
certain locations. These already assigned objects form a partial solution of the assignment
problem. In order to obtain a feasible solution, that is, a complete assignment, we must find a
completion of the partial solution. Rather than considering all the possible ways of completing the
partial solution, we first investigate whether the partial solution might lead to a complete
solution with an objective value smaller than the best solution that we already have. This is done
by calculating a lower bound on the cost of completing this partial solution.



Let $B$ be the lower bound and let $C^*$ be the cost of the best available assignment. If
$B \geq C^*$ then any completion of the partial solution can lead to no improvement. In this case,
the partial solution is said to be fathomed, and it is abandoned. On the other hand, if $B < C^*$ it
is worthwhile for us to pursue the partial solution by seeking to assign more objects.

Calculation  of  the  Lower  Bound 

Suppose that a set of objects indexed by the set $I$ has already been assigned to a set of
locations indexed by the set $J$. In particular, suppose that object $i$ is assigned to location $i^*$. A
lower bound $B$ on the cost of this partial solution and its completion is computed as
$B = C_1 + C_2 + C_3$, where

$C_1$ = cost of the partial assignment;

$C_2$ = lower bound on the cost of interaction between assigned objects and unassigned
objects plus the fixed cost of locating the unassigned objects;

$C_3$ = lower bound on the cost of interaction among the unassigned objects themselves.


Note that $C_1$ is given by

\[
C_1 = \sum_{i \in I} f_{ii^*} + \sum_{i, j \in I} u_{ij}\, d_{i^*j^*}.
\]

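Here $C_2$ is the optimal cost of the following linear-assignment problem:

\[
\text{Minimize} \quad \sum_{i \notin I} \sum_{j \notin J} b_{ij}\, x_{ij}
\]
\[
\text{subject to} \quad \sum_{j \notin J} x_{ij} = 1 \quad \text{for } i \notin I,
\]
\[
\sum_{i \notin I} x_{ij} = 1 \quad \text{for } j \notin J,
\]
\[
x_{ij} \geq 0 \quad \text{for } i \notin I,\ j \notin J,
\]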

where $b_{ij}$ is a bound on the cost resulting from the assignment of object $i$ to location $j$. For
example, we can use

\[
b_{ij} = f_{ij} + \sum_{k \in I} \left( u_{ik}\, d_{jk^*} + u_{ki}\, d_{k^*j} \right).
\]

Of course, a complete solution of the linear-assignment problem can be replaced by the simpler
task of reduction of the cost matrix $(b_{ij})$ such that it has at least one zero in each row and each
column by subtraction of the minima of the rows from the rows, and the minima of the
columns from the resultant columns.
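
The reduction just described can be sketched as follows. This is a minimal Python illustration, not the authors' code: the names `bound_matrix` and `reduction_bound`, the dictionary `loc` mapping each assigned object to its location, and the numerical data are all assumptions made for the example. The second function returns the total amount subtracted during row and column reduction, which never exceeds the optimal linear-assignment cost.

```python
def bound_matrix(u, d, f, loc, free_objs, free_locs):
    """b[i][j]: fixed cost of putting free object i in free location j, plus its
    interaction cost with every assigned object k (which sits at loc[k])."""
    return [[f[i][j] + sum(u[i][k] * d[j][loc[k]] + u[k][i] * d[loc[k]][j]
                           for k in loc)
             for j in free_locs]
            for i in free_objs]


def reduction_bound(cost):
    """Sum of the row minima plus the column minima of the row-reduced matrix."""
    row_min = [min(row) for row in cost]
    reduced = [[c - lo for c in row] for row, lo in zip(cost, row_min)]
    col_min = [min(col) for col in zip(*reduced)]
    return sum(row_min) + sum(col_min)


# Illustrative data (hypothetical): object 1 is already placed in location 1.
u = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]
d = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
f = [[0, 0, 0] for _ in range(3)]

b = bound_matrix(u, d, f, loc={1: 1}, free_objs=[0, 2], free_locs=[0, 2])
c2 = reduction_bound(b)

# The reduction can be weaker than the true assignment optimum: for this
# matrix the best assignment costs 12, while the reduction yields only 11.
weaker = reduction_bound([[4, 2, 8], [4, 3, 7], [3, 1, 6]])
```

The trade-off is the one stated in the text: the reduction is cheaper than solving the assignment problem but may give a smaller, and hence less effective, value of $C_2$.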

Two methods of computation of $C_3$ are available. The first method relies on the ranking
of the interactions and distances as follows. Rank the interactions $u_{ij}$ in a descending order for
$i, j \notin I$, and rank the distances $d_{ij}$ in an ascending order for $i, j \notin J$. This results in an
ordered interaction vector and an ordered distance vector. Then $C_3$ is the inner product of
these two vectors. In other words, we calculate $C_3$ by matching the largest interaction among
unassigned elements to the smallest distance between unassigned locations, the second largest
interaction to the second smallest distance, and so forth. Clearly, this procedure will give a
lower bound on the cost among unassigned elements.
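
The ranking method can be written in a few lines. The sketch below is only an illustration under stated assumptions: the function name `ranked_bound` and the data are hypothetical, and diagonal entries $u_{ii}$ and $d_{jj}$ are assumed to be zero so that only pairs of distinct objects and distinct locations are ranked.

```python
def ranked_bound(u, d, free_objs, free_locs):
    """Inner product of interactions sorted in descending order with
    distances sorted in ascending order, over unassigned objects/locations."""
    inter = sorted((u[i][k] for i in free_objs for k in free_objs if i != k),
                   reverse=True)
    dist = sorted(d[j][l] for j in free_locs for l in free_locs if j != l)
    # Largest interaction meets smallest distance, and so on down the lists.
    return sum(a * b for a, b in zip(inter, dist))


# Illustrative data (hypothetical).
u = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]
d = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

c3_all = ranked_bound(u, d, [0, 1, 2], [0, 1, 2])   # nothing assigned yet
c3_two = ranked_bound(u, d, [0, 2], [0, 2])         # object 1 placed in location 1
```

For these data the bound with nothing assigned is 14, which happens to equal the best achievable interaction cost; in general the bound is only guaranteed not to exceed it.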

An alternative method for finding a suitable bound $C_3$ is the solution of a linear-assignment
problem whose cost matrix is constructed as follows. For each unlocated element $i$, rank
the interactions between it and all other unlocated elements in descending order. Similarly, for
each vacant location $j$, rank the distances between it and all other vacant locations in ascending
order. Then a lower bound $e_{ij}$ on the cost of locating facility $i$ in location $j$ is the inner product
of the above two vectors. Thus, we find $C_3$ by solving the following linear-assignment
problem:


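\[
\text{Minimize} \quad \sum_{i \notin I} \sum_{j \notin J} e_{ij}\, x_{ij}
\]
\[
\text{subject to} \quad \sum_{j \notin J} x_{ij} = 1 \quad \text{for } i \notin I,
\]
\[
\sum_{i \notin I} x_{ij} = 1 \quad \text{for } j \notin J,
\]
\[
x_{ij} \geq 0 \quad \text{for } i \notin I,\ j \notin J.
\]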

Needless to say, the above assignment problem can be combined with the assignment problem
in the $C_2$ calculation to give $C_2 + C_3$. The overall lower bound $B = C_1 + C_2 + C_3$ is now
available.
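
A sketch of the alternative $C_3$ computation follows. It builds the matrix $(e_{ij})$ and then solves the linear-assignment problem by brute-force enumeration, which is only a stand-in suitable for tiny examples; a real implementation would use a proper assignment algorithm. The names `pairing_matrix` and `assignment_min` and the data are assumptions for illustration.

```python
from itertools import permutations


def pairing_matrix(u, d, free_objs, free_locs):
    """e[i][j]: interactions of object i (descending) matched against the
    distances from location j (ascending), over unassigned items only."""
    e = []
    for i in free_objs:
        inter = sorted((u[i][k] for k in free_objs if k != i), reverse=True)
        row = []
        for j in free_locs:
            dist = sorted(d[j][l] for l in free_locs if l != j)
            row.append(sum(a * b for a, b in zip(inter, dist)))
        e.append(row)
    return e


def assignment_min(cost):
    """Optimal linear-assignment cost by enumeration (small matrices only)."""
    n = len(cost)
    return min(sum(cost[r][p[r]] for r in range(n))
               for p in permutations(range(n)))


# Illustrative data (hypothetical), with nothing assigned yet.
u = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]
d = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]

e = pairing_matrix(u, d, [0, 1, 2], [0, 1, 2])
c3 = assignment_min(e)
```

Because each $e_{ij}$ is computed per object and per location, this bound takes the position of each facility into account, at the price of solving an assignment problem.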

Continuation of the Search: Fathoming (Backward Move)


Suppose that $k$ objects indexed by the set $I$ have already been assigned to $k$ locations
indexed by the set $J$. The level of the search tree is called $k$. A bound $B$ on the cost that results
from all completions of the current partial solution is calculated as discussed above. If
$B \geq C^*$, where $C^*$ is the best known cost of a complete assignment, then the partial solution is
fathomed. The last assignment, that is, the $k$th assignment, is banned or prohibited in the hope
that this will lead to an improved completion. For example, if the $k$th assignment involves
placing object $i_k$ in location $i_k^*$, then $x_{i_k i_k^*}$ is forced to be zero. Here, $i_k$ is placed in the list of
unassigned objects, that is, $i_k$ is removed from $I$, and similarly $i_k^*$ is placed in the list of
unassigned locations, that is, $i_k^*$ is removed from $J$. We calculate a new bound $B'$ in exactly the
same manner as explained above, except, of course, that the assignment $x_{i_k i_k^*} = 1$ is prohibited
by forcing $b_{i_k i_k^*} = \infty$ while we solve the linear-assignment problem. If $B'$ is still $\geq C^*$, then the
partial solution of the first $k - 1$ assignments, while banning the assignment of $i_k$ to $i_k^*$, can still
lead to no improved solutions. Since the first $k - 1$ assignments with $x_{i_k i_k^*} = 1$ and with $x_{i_k i_k^*} = 0$
lead to no improvement, all the possibilities at level $k$ have been exhausted, and
prohibition of the assignment at level $k - 1$ is now possible. This condition is called strong fathoming.

The level of the tree is thus reduced by one unit, and the assignment at level $k - 1$ is
prohibited. If, on the other hand, the bound $B'$ is less than $C^*$, a condition referred to as weak
fathoming, then object $i_k$ is assigned to some other unassigned location. This is discussed in
more detail in the forward move of the search. The cases of strong and weak fathoming are
depicted in Figures 1 and 2.

Progress of the Search (Forward Move)

If $B < C^*$ then we must choose an object $i_{k+1}$ for assignment. For example, we may
choose $i_{k+1}$ to be an unassigned object with maximum interactions with already assigned
objects, or choose $i_{k+1}$ to be an unassigned object with maximum interactions with the most
recently assigned object $i_k$. This object is assigned to an unassigned location $i_{k+1}^*$ which is not
prohibited. This location can be chosen in such a way that the total weighted interaction
with assigned objects is minimal. For example, choose $i_{k+1}^*$ to be the location $t$ which minimizes

\[
\sum_{j=1}^{k} u_{i_{k+1} i_j}\, d_{t\, i_j^*} + \sum_{j=1}^{k} u_{i_j i_{k+1}}\, d_{i_j^* t}
\]

over $t \notin J$ such that $x_{i_{k+1} t} = 1$ is not prohibited.




Needless to say, when the level of the tree is $m$, if the cost is less than $C^*$, then $C^*$ is
updated and the corresponding assignment is stored.

Termination 

We have described forward and backward progress of the search tree. If the level of the
tree ever reaches value zero, then we stop. This would mean that we are currently at level one
and are trying to backtrack. This means that all possible assignments under $x_{i_1 i_1^*} = 1$ and
$x_{i_1 i_1^*} = 0$ have already been enumerated, so that all possible ways of assignment of the $m$ objects are
enumerated, and we stop. The stored assignment and corresponding $C^*$ give the optimal
solution.

Summary of the Algorithm

We have discussed above all the details required to describe the following solution
procedure of the quadratic assignment problem:




INITIALIZATION STEP: Let the prohibited locations for object $i$ be $P(i) = \emptyset$ for
$i = 1, \ldots, m$, and let $C^* = \infty$. Choose an object $i_1$ and place it in location $i_1^*$. We may
determine $i_1$ by maximizing $\sum_{j=1}^{m} (u_{ij} + u_{ji})$ for $i = 1, \ldots, m$, and we can determine $i_1^*$ by minimizing
$\sum_{i=1}^{m} (d_{ij} + d_{ji})$ for $j = 1, \ldots, m$. Let $I = \{i_1\}$ and $J = \{i_1^*\}$. Let $k = 1$, and go to Step 1.

STEP 1 (Forward Move): Calculate a lower bound $B$ on all completions of the current
partial solution. Here $B = C_1 + C_2 + C_3$, where $C_1$, $C_2$, and $C_3$ are calculated as discussed
above with the exception that $b_{ij} = \infty$ if $j \in P(i)$. If $B \geq C^*$, go to Step 2. Otherwise pick
$i_{k+1} \notin I$ such that

\[
u_{i_k i_{k+1}} + u_{i_{k+1} i_k} = \max_{i \notin I} \left( u_{i_k i} + u_{i i_k} \right)
\]

and place $i_{k+1}$ in $i_{k+1}^*$, where we determine $i_{k+1}^*$ by minimizing

\[
f_{i_{k+1} j} + \sum_{l=1}^{k} \left( u_{i_{k+1} i_l}\, d_{j\, i_l^*} + u_{i_l i_{k+1}}\, d_{i_l^* j} \right)
\]

over $j \notin J$, $j \notin P(i_{k+1})$. Replace $I$ by $I \cup \{i_{k+1}\}$ and $J$ by $J \cup \{i_{k+1}^*\}$.

If $k = m - 1$, then $C_1$ is the cost of the complete assignment. If $C_1 < C^*$, replace $C^*$ by
$C_1$, store $(i_t, i_t^*)$ for $t = 1, \ldots, m$, replace $k$ by $m$, and go to Step 2. If $C_1 \geq C^*$, then replace
$k$ by $m$ and go to Step 2. If $k < m - 1$, then replace $k$ by $k + 1$ and repeat Step 1.

STEP 2 (Fathoming): Here $i_k$ is removed from $I$ and $i_k^*$ is removed from $J$. Replace
$P(i_k)$ by $P(i_k) \cup \{i_k^*\}$. Calculate a lower bound $B$ on all completions of the partial solution
$x_{i_t i_t^*} = 1$ for $t = 1, \ldots, k - 1$ and $x_{i_k i_k^*} = 0$, where $B = C_1 + C_2 + C_3$, and $b_{ij} = \infty$ if $j \in P(i)$.
If $B \geq C^*$, go to Step 3. Otherwise, assign $i_k$ to a location $t \notin J \cup P(i_k)$ such that the cost

\[
f_{i_k t} + \sum_{j=1}^{k-1} u_{i_k i_j}\, d_{t\, i_j^*} + \sum_{j=1}^{k-1} u_{i_j i_k}\, d_{i_j^* t}
\]

is minimized over $t \notin J \cup P(i_k)$. Then $i_k$ is added to $I$, and
$t$, the new $i_k^*$, is added to $J$. Go to Step 1.

STEP 3 (Strong Fathoming): Here $i_{k-1}$ is deleted from $I$ and $i_{k-1}^*$ is deleted from $J$. Then
$i_{k-1}^*$ is placed in $P(i_{k-1})$, $P(i_k)$ is replaced by the empty set, and $k$ is replaced by $k - 1$. If
$k = 0$ go to Step 4, otherwise go to Step 2.

STEP 4 (Termination): The search of the decision tree has been completed. The optimal
cost is $C^*$, and its corresponding assignment $(i_k, i_k^*)$, $k = 1, \ldots, m$, is the optimal assignment.
Stop.

Note that during the initialization step an upper bound $C^* = \infty$ is used. As the search
progresses, $C^*$ denotes the objective value of the best available complete assignment. Further,
$P(i)$ represents the locations that object $i$ cannot be assigned to. These are initialized by the
empty sets. Step 1 represents a forward step, where the level of the tree increases by one unit.
In this case the bound is less than $C^*$, hence a complete solution with an objective better than
$C^*$ is possible. Step 2 is a fathoming step, where $B \geq C^*$. In this case the last assignment is
prohibited. Immediately, a new bound is calculated. If the new bound is less than $C^*$, then a
forward move is made. But if the bound is still greater than or equal to $C^*$, then a strong
fathoming is made at Step 3, and the level of the tree is reduced. Of course, strong fathoming
is most desirable, since it avoids the expensive task of trying to locate object $i_k$ in a free
location other than $i_k^*$.
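
To make the interplay of bounding and fathoming concrete, the following Python sketch runs a simplified depth-first branch and bound. It is not a transcription of Steps 1 through 4: recursion replaces the explicit level counter and the prohibition sets $P(i)$, objects are branched on in index order rather than by maximum interaction, $C_2$ is taken from the row and column reduction, and $C_3$ from the ranking method. Zero diagonals in `u` and `d` are assumed, and all names and data are hypothetical.

```python
import math


def lower_bound(u, d, f, loc):
    """B = C1 + C2 + C3 for a partial assignment loc (object -> location)."""
    m = len(u)
    free_objs = [i for i in range(m) if i not in loc]
    free_locs = [j for j in range(m) if j not in loc.values()]
    # C1: cost already incurred among the assigned objects.
    c1 = sum(f[i][loc[i]] for i in loc)
    c1 += sum(u[i][k] * d[loc[i]][loc[k]] for i in loc for k in loc)
    # C2: row and column reduction of the matrix (b_ij).
    b = [[f[i][j] + sum(u[i][k] * d[j][loc[k]] + u[k][i] * d[loc[k]][j]
                        for k in loc)
          for j in free_locs] for i in free_objs]
    c2 = 0
    if b:
        row_min = [min(row) for row in b]
        c2 = sum(row_min) + sum(
            min(b[r][c] - row_min[r] for r in range(len(b)))
            for c in range(len(free_locs)))
    # C3: largest interactions matched with smallest distances.
    inter = sorted((u[i][k] for i in free_objs for k in free_objs if i != k),
                   reverse=True)
    dist = sorted(d[j][l] for j in free_locs for l in free_locs if j != l)
    c3 = sum(a * e for a, e in zip(inter, dist))
    return c1 + c2 + c3


def branch_and_bound(u, d, f, alpha=1.0):
    """Depth-first search; a node is fathomed when B >= alpha * C*."""
    m = len(u)
    best = {"cost": math.inf, "loc": None}

    def search(loc):
        if len(loc) == m:
            cost = lower_bound(u, d, f, loc)  # only C1 remains: the exact cost
            if cost < best["cost"]:
                best["cost"], best["loc"] = cost, dict(loc)
            return
        i = next(o for o in range(m) if o not in loc)  # simple branching rule
        for j in range(m):
            if j in loc.values():
                continue
            loc[i] = j                                  # forward move
            if lower_bound(u, d, f, loc) < alpha * best["cost"]:
                search(loc)
            del loc[i]                                  # backward move

    search({})
    return best["cost"], best["loc"]


# Illustrative data (hypothetical).
u = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]
d = [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
f = [[0, 0, 0] for _ in range(3)]
best_cost, best_loc = branch_and_bound(u, d, f)
```

On this instance the first complete assignment found already costs 14, after which the branches placing object 0 in locations 1 and 2 are fathomed at level one without being expanded.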




Adding New Facilities to an Existing Layout

In many applications a large number of facilities are already preassigned, and only some
new facilities are to be placed in such a way that the overall cost is minimized. In this case, the
above algorithm can be applied with a few obvious modifications in the calculations. Since the
preassigned objects and their locations will remain unchanged, these objects will always be in
the set $I$ and their locations will always be in the set $J$. In the search tree, if $y$ objects are
already assigned, we start the search by assigning more objects, that is, the level of the tree
starts at $y + 1$. If the level of the tree ever becomes $y$, then we stop.

3.  SUBOPTIMAL  AND  OPTIMAL  SOLUTIONS  BY  STEPPED  FATHOMING 

Due to the highly combinatorial nature of the problem, the task of finding an optimal
solution and then verifying its optimality within a reasonable computational time is almost
impossible in the case of large problems. Here we must resort to suboptimal solutions. The
branch-and-bound procedure itself can be used to obtain qualified suboptimal solutions. In [1],
Bazaraa and Elshafei proposed two stepped fathoming methods for obtaining controlled
suboptimal solutions in the context of branch and bound. The application of these methods for the
quadratic-assignment problem is discussed in this section.

Method  1 

Recall that a partial solution is fathomed if the lower bound on all its completions is at
least as big as $C^* > 0$, the best known objective value. Suppose instead that a partial solution is
fathomed if $B \geq \alpha C^*$, where $\alpha \in (0,1]$. In this case, the partial solution is abandoned if there
is no hope that it will lead to an objective which is better than $\alpha C^*$. The purpose of this simple
strategy is clear. We want to fathom the partial solution quickly even if it might lead to a slight
improvement. Of course, as a new $C^*$ is found, then we fathom whenever the bound is greater
than or equal to $\alpha$ times the new $C^*$. The procedure continues until we cannot find a feasible
solution with an objective less than $\alpha C^*$. Thus, we have a feasible assignment with objective
$C^*$ coupled with the statement that the optimal objective is greater than or equal to $\alpha C^*$.

Choice of $\alpha$: Of course, if $\alpha$ is small, then fathoming will speed up considerably, resulting in a
small computational effort. But on the other hand, the quality of the best feasible solution may
not be satisfactory. We recommend values of $\alpha \geq 0.9$, depending on the accuracy required.
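
The guarantee of Method 1 can be restated as a bound on the relative error. The short derivation below is a sketch; the symbol $z^*$ for the true optimal objective value is introduced here only for illustration and is not the paper's notation.

```latex
% At termination the optimum satisfies z^* >= alpha C^*, with C^* > 0 and 0 < alpha <= 1.
\[
z^* \ge \alpha C^*
\quad\Longrightarrow\quad
C^* \le \frac{z^*}{\alpha}
\quad\Longrightarrow\quad
\frac{C^* - z^*}{z^*} \le \frac{1 - \alpha}{\alpha}.
\]
% Example: alpha = 0.9 gives (1 - 0.9)/0.9 = 1/9, i.e. about 11.1 percent.
```

So with $\alpha = 0.9$ the stored assignment is at most about 11 percent more costly than the optimum, and with $\alpha = 1$ it is optimal.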

Method  2 

At each stage of the algorithm, we have an upper bound $C^*$. A lower bound $L$ on the
overall problem can be devised. Rather than fathoming on $C^*$, suppose we fathom on
$K = \alpha C^* + (1 - \alpha)L$, where $\alpha \in (0,1]$. Since $L \leq C^*$, then $K \leq C^*$. Two cases are
possible. In the first case, we will be able to find a complete solution with objective less than $K$.
The objective value of this new solution becomes the new upper bound $C^*$ and the process is
repeated. In the second case, we will not be able to find such a solution. This automatically
implies that there are no solutions with objective less than $K$ and hence $K$ itself is the new
lower bound. The process is repeated. From this we keep narrowing the gap between the
lower and upper bounds, either by lowering the upper bound when an improved feasible
assignment is found or by raising the lower bound when no feasible solution with objective less than
$K$ is found. When the difference between the lower and upper bounds is smaller than a
prescribed tolerance we stop.

Choice of $\alpha$: Here $\alpha$ is any number in the interval $(0,1]$. Of course, if $\alpha$ is close to 1, then we
are in effect fathoming on a number very close to $C^*$, and the search will not speed up
considerably. On the other hand, if $\alpha$ is close to zero, then improvement is achieved only if we
obtain a feasible solution very close to the overall lower bound. In this case, fathoming will be
fast, but it is likely not to obtain feasible solutions with an objective value that is less than $K$. If
$\alpha = 0.5$ then the interval of uncertainty in which the optimal objective value lies will be halved
at each stage.
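
The narrowing of the interval in Method 2 can be sketched as a short loop. In the Python illustration below, `find_below` is a hypothetical oracle standing in for one pass of the tree search: it returns the cost of some complete solution with objective less than $K$, or `None` if none exists. The oracle used in the example simply scans a made-up list of solution costs.

```python
def stepped_fathoming(find_below, lower, upper, alpha=0.5, tol=1e-9):
    """Narrow [lower, upper] around the optimal value by fathoming on
    K = alpha * upper + (1 - alpha) * lower until the gap is within tol."""
    while upper - lower > tol:
        K = alpha * upper + (1 - alpha) * lower
        cost = find_below(K)
        if cost is None:
            lower = K       # no solution below K: K is a valid lower bound
        else:
            upper = cost    # improved feasible solution: new upper bound
    return lower, upper


# Hypothetical costs of all complete assignments of a tiny problem.
costs = [14, 16, 16, 18, 18, 14]


def find_below(K):
    """Toy oracle: first listed cost strictly below K, if any."""
    return next((c for c in costs if c < K), None)


bounds = stepped_fathoming(find_below, lower=10, upper=18, alpha=0.5, tol=0.5)
```

Starting from the interval [10, 18], the first pass fathoms on $K = 14$ and finds nothing below it, so the lower bound rises to 14; the second pass fathoms on $K = 16$, finds a solution of cost 14, and the bounds meet.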


Calculation  of  the  Initial  Overall  Lower  and  Upper  Bounds 

Initial lower and upper bounds are needed to implement the above fathoming scheme. To
calculate the lower bound, first calculate a lower bound $b_{ij}$ on the cost of locating object $i$ to
location $j$, as discussed in Section 2. Then a linear-assignment problem is solved to find $L$.
More precisely, let $L$ be the optimal objective value of the following problem:


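\[
\text{Minimize} \quad \sum_{i=1}^{m} \sum_{j=1}^{m} b_{ij}\, x_{ij}
\]
\[
\text{subject to} \quad \sum_{j=1}^{m} x_{ij} = 1 \quad \text{for } i = 1, \ldots, m,
\]
\[
\sum_{i=1}^{m} x_{ij} = 1 \quad \text{for } j = 1, \ldots, m,
\]
\[
x_{ij} \geq 0 \quad \text{for } i, j = 1, \ldots, m.
\]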

We  can  obtain  an  upper  bound  C*  immediately  by  calculating  the  quadratic  cost  of  the  optimal 
assignment  resulting  from  the  above  problem.  Now  Method  2 can  be  initiated. 

Exact  Solution  by  Stepped  Fathoming 

Either of the above two methods could be slightly modified to provide optimal solutions
and still reduce the portion of the decision tree explicitly enumerated. Suppose that with any
given $\alpha$ the search terminates with the conclusion that there exists no feasible solution with a
quadratic objective value less than $K$, where $K = \alpha C^*$ for the first method and $K = \alpha C^* +
(1 - \alpha) L$ for the second method. The search can then be repeated from the complete stored
solution whose objective value is $C^*$ with a larger value of $\alpha$.

Several increasing values of $\alpha$, with the last value equal to one, could be used.
Obviously, for $\alpha = 1$ we would fathom if the objective value is at least equal to $C^*$, and the method
will produce an optimal solution. Even though portions of the decision tree may be repeated,
the quick fathoming would usually result in a reduction of the overall computational effort. For
further details, the reader is referred to [1].

4.  COMPUTATIONAL  EXPERIENCE 

The experience gained with the solution procedure was in relation to the problems
reported by Nugent, Vollman, and Ruml [13]. First, we will discuss some details pertinent to
the purely computational aspects of the procedure.

Choice  of  a 

If a small value of $\alpha$ is chosen, the fictitious upper bounds $\alpha C^*$ and $\alpha C^* + (1 - \alpha)L$ tend
to be tighter, and hence the search becomes faster. However, a smaller value of $\alpha$ may increase
the number of times we must restart the search from the current best complete solution.
On the other hand, if $\alpha$ is large then fathoming becomes weaker, and the search tends to be
lengthier but to have fewer restarts. The tradeoff is only computational and is data dependent.




During the course of our study, we noticed that due to the features of the search
procedure many successive good solutions are obtained very early in the search, and the optimum
follows suit. As a result, we have adopted the strategy of choosing a small value of $\alpha$ at the
beginning of the search and at a certain stage switching to $\alpha = 1$. We accomplish this by
specifying an initial value of $\alpha$ and also specifying a difference between the actual upper and lower
bounds at the achievement of which $\alpha$ is switched to 1. If the difference is appropriately
chosen, we will get to the stage where the upper bound is tight enough to speed up the search,
and also we will not have to restart the search once the tree is enumerated, as there is no
interval of uncertainty in this case. We found that the choice of $\alpha = 0.7$ as a starting value was
adequate for all the problems solved. The difference between the two bounds at which we
switched to $\alpha = 1$ varied from one problem to another.

Solving  the  Linear  Assignment  Problem 

As  was  mentioned  in  Section  2,  the  lower  bound  can  be  calculated  by  various  methods. 
One  procedure  involves  the  use  of  a linear-assignment  problem  to  obtain  a tighter  lower  bound. 
There  is  no  need,  however,  to  solve  a fresh  assignment  problem  each  time  a lower  bound  is  to 
be  calculated.  It  is  possible  to  take  any  previous  solution  and  update  it  according  to  the  new 
cost  matrix. 

Tables  1 and  2 summarize  the  experience  with  the  following  two  codes: 

QAP3: A code for an algorithm based on calculation of the lower bound $C_1 + C_2 + C_3$
by the matching of the ordered interaction and distance vectors as discussed in Section 2.

QAP7: A code for an algorithm based on calculation of the lower bound $C_1 + C_2 + C_3$
by the solution of a linear assignment problem. Here the cost matrix is first reduced so that it
has a zero in every row and every column. If the resulting bound is greater than or equal to $C^*$, we
fathom. Otherwise, the complete linear-assignment problem is solved in the hope that the
lower bound can be tightened.

Total  Number  of  Nodes  : the  number  of  nodes,  both  intermediate  and  terminal,  generated 
during  the  search. 

Total  Number  of  Moves:  the  number  of  forward  and  backward  moves  conducted  during 
the  search. 

Number of times it was necessary to solve an assignment problem: whenever the lower bound
calculated at any particular node by the reduction method was less than the current upper
bound, it was necessary to solve a linear assignment problem to improve the value of this lower
bound. Naturally, this strategy is applicable only to QAP7.

Fathoming  Efficiency:  the  ratio  between  the  number  of  times  the  search  was  not  pursued, 
as  a result  of  the  lower-bound  test,  to  the  total  number  of  times  the  lower-bound  test  was 
applied. 

Comparison of QAP3 and QAP7: We recall that the only difference between QAP3 and QAP7 is
the method of calculation of the lower bound as shown in Section 2. In Table 1, we notice that
the solution times when QAP7 was used were always less than those obtained when QAP3 was
used.




Our  observation  about  QAP3  is  that  it  can  face  severe  difficulties  when  the  problem  size 
increases.  For  example,  problem  4005  was  rerun  with  a starting  upper  bound  equal  to  the  true 
optimal  objective  value  of  # ■ 1 in  the  hope  that  this  would  speed  up  the  search.  However, 
we  had  to  terminate  the  problem  after  50,400  moves,  as  we  noticed  that  11,539  nodes  were 
generated  but  the  number  of  nodes  at  various  levels  were: 

0 4 41  358  2411  4101  3715  1000  184  8 3 3. 

Thus a substantial part of the tree was still to be searched, and the estimated time for the
completion of the search was about 15 minutes on the IBM 370/165. Also, the experience with
4006 was not more encouraging. Note that both QAP3 and QAP7 found the optimal solutions
of problems 4001 through 4004 and verified optimality. QAP3 and QAP7 found the optimal
solution of problem 4005, but only QAP7 verified optimality.

We might also add that the concept of stepped fathoming was essential in the procedure.
QAP3 and QAP7 were not able to find optimal solutions to some of the reported problems
when $\alpha = 1$ was used from the start of the search.

ACKNOWLEDGMENTS 

The authors would like to thank Dr. L. F. McGinnis of the Georgia Institute of
Technology for providing them with a code for the solution of the linear-assignment problem.

REFERENCES 

[1] Bazaraa, M. S., and A. N. Elshafei, "On the Use of Fictitious Bounds in Tree Search
Algorithms," Management Science 23, 904-908 (1977).

[2] Breuer, M. A., "The Formulation of Some Allocation and Connection Problems as Integer
Programs," Naval Research Logistics Quarterly 13, 83-95 (1966).

[3] Dorris, A. L., "The Utility of Optimization Techniques in the Design of Man-Machine
Systems," Masters Thesis, Georgia Institute of Technology, Atlanta, Georgia (1971).

[4] Elshafei, A. N., "Hospital Layout as a Quadratic Assignment Problem," Operational
Research Quarterly 28, 167-179 (1977).

[5] Gaschutz, G. K., and J. H. Ahrens, "Suboptimal Algorithm for the Quadratic Assignment
Problem," Naval Research Logistics Quarterly 15, 49-62 (1968).

[6] Gavett, J. W., and N. V. Plyter, "The Optimal Assignment of Facilities to Locations by
Branch and Bound," Operations Research 14, 210-232 (1966).

[7] Gilmore, P. C., "Optimal and Suboptimal Algorithms for the Quadratic Assignment
Problem," SIAM Journal on Applied Mathematics 10, 305-313 (1962).

[8] Graves, G. W., and A. B. Whinston, "An Algorithm for the Quadratic Assignment
Problem," Management Science 17, 453-471 (1970).

[9] Hopkins, L. D., "Land-Use Plan Design: Quadratic Assignment and Central Facility
Models," Environment and Planning 9, 625-642 (1977).

[10] Koopmans, T. C., and M. Beckman, "Assignment Problems and the Location of
Economic Activities," Econometrica 25, 53-76 (1957).

[11] Land, A. H., "A Problem of Assignment with Interrelated Costs," Operational Research
Quarterly 14, 185-198 (1963).

[12] Lawler, E. L., "The Quadratic Assignment Problem," Management Science 9, 586-599
(1963).

[13] Nugent, C. E., T. E. Vollman, and J. Ruml, "An Experimental Comparison of Techniques
for the Assignment of Facilities to Locations," Operations Research 16, 150-173 (1968).

[14] Pierce, J. F., and W. B. Crowston, "Tree Search Algorithms for Quadratic Assignment
Problems," Naval Research Logistics Quarterly 18, 1-36 (1971).

[15] Steinberg, L., "The Backboard Wiring Problem: A Placement Algorithm," SIAM Journal
on Applied Mathematics 3, 37-50 (1961).

[16] Whitehead, B., and M. Z. Elders, "An Approach to the Optimum Layout of Single Story
Buildings," Architect's Journal 139, 1373-1380 (1964).


THE  ENUMERATION  OF  ALL  EFFICIENT  SOLUTIONS 
FOR  A LINEAR  MULTIPLE-OBJECTIVE  TRANSPORTATION  PROBLEM 

Heinz  Isermann 

University  of  Bielefeld 
Federal  Republic  of  Germany 

ABSTRACT 

An algorithm is presented by which the set of all efficient solutions for a
linear multiple-objective transportation problem can be enumerated. First the
algorithm determines an initial efficient basic solution. In a second step all
efficient basic solutions are enumerated. Finally, the set of all efficient
solutions is constructed as a union of a minimal number of convex sets of efficient
solutions. The algorithm is illustrated by a numerical example.


1.  INTRODUCTION 

The classical transportation problem is a linear programming problem in which the
constraints exhibit a particular type of mathematical structure. The usual scenario of the
transportation-type constraints runs like this: At each of the $m$ origins $O_i$ there is a quantity $a_i$
of a commodity at our disposal which we wish to ship to $n$ destinations $D_j$ to satisfy the
demands $b_j$ there. The symbol $x_{ij}$ represents the unknown quantity shipped from $O_i$ to $D_j$.

Instead of considering one scalar-valued objective function, this paper will focus on
transportation problems with $k > 1$ linear objective functions to take care of those planning
problems of economic origin which exhibit the mathematical structure of a transportation problem
but are characterized by the existence of several objective functions. In most cases these
functions are measured on different scales and have different units, and in general the decision
maker is unable to combine these objective functions into one overall utility function. Let $c^l_{ij}$
represent the proportional contribution to the value of the $l$th objective function of shipping
one unit of the commodity from $O_i$ to $D_j$. If the decision maker wants to minimize the $k$
objective functions simultaneously, he will generally come to some point where a further
reduction of the value of any objective function may only be obtained at the expense of increasing
the value of at least one other objective function. In other words, at this point at least two of
the $k$ objective functions considered are in conflict.

Before going further, for convenience let us introduce the following notation. Let $R$
denote the set of the real numbers, $R_0$ the set of the nonnegative real numbers, and $R_+$ the set
of the positive real numbers. With regard to vector inequalities, the following convention will
be applied: $a \geqq b$ if and only if $a_i \geq b_i$ for all $i = 1, \ldots, n$; $a \geq b$ if and only if $a_i \geq b_i$ for all
$i = 1, \ldots, n$ and $a_i > b_i$ for at least one $i$; $a > b$ if and only if $a_i > b_i$ for all $i = 1, \ldots, n$.

The transpose of a vector or a matrix will be denoted by an upper index (superscript) $T$.




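The multiple-objective transportation problem is the problem of minimizing the $k$
scalar-valued objective functions considered, except for the conflicts among them. It may be stated as

$$\min z_1 = \sum_{i=1}^{m} \sum_{j=1}^{n} c^1_{ij} x_{ij},$$

$$\min z_2 = \sum_{i=1}^{m} \sum_{j=1}^{n} c^2_{ij} x_{ij},$$

$$\vdots$$

$$\min z_k = \sum_{i=1}^{m} \sum_{j=1}^{n} c^k_{ij} x_{ij},$$

subject to

$$\sum_{j=1}^{n} x_{ij} = a_i, \quad i = 1, \ldots, m,$$

$$\sum_{i=1}^{m} x_{ij} = b_j, \quad j = 1, \ldots, n,$$

$$x_{ij} \geq 0, \quad i = 1, \ldots, m;\ j = 1, \ldots, n,$$

or, setting $M = \{1, \ldots, m\}$, $N = \{1, \ldots, n\}$, $J = \{(i,j) \mid i \in M,\ j \in N\}$, as the problem

$$\text{"min"} \Bigl\{ z = \sum_{(i,j) \in J} c_{ij} x_{ij} \;\Bigm|\; \sum_{j \in N} x_{ij} = a_i \ \text{for all } i \in M,\ \ \sum_{i \in M} x_{ij} = b_j \ \text{for all } j \in N,\ \ x_{ij} \geq 0 \ \text{for all } (i,j) \in J \Bigr\}, \tag{1}$$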

where $z \in R^k$ and $c_{ij} = (c^1_{ij}, \ldots, c^k_{ij})^T$. We shall assume throughout this paper $a_i > 0$, $b_j > 0$,
and $\sum_{i \in M} a_i = \sum_{j \in N} b_j$. The set of all feasible solutions for (1) will be denoted by $S$. Let
$x^0 = (x^0_{11}, x^0_{12}, \ldots, x^0_{mn})^T$ be a feasible solution for (1). The solution $x^0$ is said to be an efficient
or nondominated solution for (1) if and only if there is no other feasible solution $x'$ for (1)
such that $z' = \sum_{(i,j) \in J} c_{ij} x'_{ij} \leq \sum_{(i,j) \in J} c_{ij} x^0_{ij} = z^0$ (componentwise, with strict inequality in at
least one component). In (1) the operator "min" indicates that all
efficient solutions for (1) are to be determined. The set of all efficient solutions for (1) will be
denoted by $S^0$. In general $S^0$ is not a convex set.
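
The dominance test behind this definition can be sketched in a few lines of Python. This is only an illustration on a finite list of objective vectors, not the enumeration algorithm of the paper; the function names and the sample vectors are hypothetical.

```python
from typing import List, Sequence, Tuple


def dominates(z_a: Sequence[float], z_b: Sequence[float]) -> bool:
    """Return True if z_a <= z_b in every component and z_a < z_b in at least one."""
    pairs = list(zip(z_a, z_b))
    return all(a <= b for a, b in pairs) and any(a < b for a, b in pairs)


def nondominated(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the vectors that no other vector in the list dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical objective vectors (k = 2), both objectives to be minimized.
candidates = [(3, 5), (4, 4), (5, 3), (5, 5), (4, 6)]
print(nondominated(candidates))  # [(3, 5), (4, 4), (5, 3)]
```

The three surviving vectors are mutually incomparable, which mirrors the remark that several efficient solutions usually remain and further preference information is needed to choose among them.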

Thus, by solving the multiple-objective transportation problem (1) we end up with a
subset of feasible solutions among which a most-preferred solution is sure to lie,
but between which, in the light of what is known about the decision maker's preference system
at this point, no further discrimination is possible. Except for the case in which an efficient
solution is minimal with respect to each scalar-valued objective function (which indicates that
the objective functions are not in conflict), further information on the decision maker's
preference system is necessary in order to select a most-preferred solution from the set of
efficient solutions. Though the algorithm we are going to present in this paper will be keyed to
the determination of the set of all efficient solutions for the multiple-objective transportation
problem, it is well suited to help the decision maker identify the relevant areas of efficient
solutions and arrive at a final solution at which all other relevant efficient solutions have been
considered and rejected.

In view of Theorem 1 in [3], a multiparametric programming routine could be applied to
the multiple-objective transportation problem (1). A parametric-programming approach has
been applied to bicriterion problems in [2] and [8]. As will be seen later, the proposed
algorithm may be regarded as a dual method for parametric programming in that it identifies
efficient solutions for (1) by an approach which is dual to that in multiparametric programming.

We shall distinguish three phases of the algorithm. In Phase I an initial efficient basic
solution for the multiple-objective transportation problem (1) is found. Provided that the
initial efficient basic solution is not unique, each of the other efficient basic solutions for (1) will
be enumerated in Phase II. Finally, in Phase III the set of all efficient solutions for (1) is
established as a union of a minimal number of convex sets of efficient solutions.

To solve the multiple-objective transportation problem, we shall utilize some duality
results for multiple-objective linear programs and some fundamental results for the system of
constraints in (1) which are the same as those needed in ordinary transportation problems. Let $A$
denote the matrix of coefficients of the linear constraints in (1), i.e., $A = (a_{11}, \ldots, a_{mn})$,
where $a_{ij} = e_i + e_{m+j}$ and $e_r \in R^{m+n}$ is a unit vector whose $r$th component is $+1$.

THEOREM  1:  The  matrix  A is  of  rank  (m  + n - 1). 

THEOREM  2:  The  matrix  A is  totally  unimodular,  i.e.,  every  square  submatrix  of  A has 
its  determinant  equal  to  zero,  + 1,  or  -1. 

Recall  that  a square  matrix  is  said  to  be  triangular  if,  after  suitable  rearrangement  of  rows 
and  columns,  all  coefficients  below  the  principal  diagonal  are  zero. 

THEOREM  3:  Every  basis  of  A is  triangular. 

A proof of the above theorems is found in [1] and [7].
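
Theorem 1 is easy to check numerically for small cases. The sketch below builds $A$ column by column from $a_{ij} = e_i + e_{m+j}$ and computes its rank by exact Gaussian elimination; the helper names are hypothetical and the check is an illustration, not a proof.

```python
from fractions import Fraction
from typing import List


def transportation_matrix(m: int, n: int) -> List[List[int]]:
    """Return A with m + n rows; the column of cell (i, j) is e_i + e_{m+j}."""
    columns = []
    for i in range(m):
        for j in range(n):
            col = [0] * (m + n)
            col[i] = 1          # origin (row) constraint i
            col[m + j] = 1      # destination (column) constraint j
            columns.append(col)
    # Transpose the list of columns into a list of rows.
    return [[col[r] for col in columns] for r in range(m + n)]


def rank(matrix: List[List[int]]) -> int:
    """Rank by Gaussian elimination over the rationals (no rounding error)."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    found = 0
    for c in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][c] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            if rows[r][c] != 0:
                factor = rows[r][c] / rows[found][c]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
        if found == len(rows):
            break
    return found


for m, n in [(2, 2), (3, 3), (2, 4)]:
    print(m, n, rank(transportation_matrix(m, n)), m + n - 1)
```

In each case the two printed numbers agree: the sum of the origin rows equals the sum of the destination rows, so exactly one of the $m + n$ constraints is redundant.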

A dual problem to (1) is the following (see [4] for details):

$$\text{"max"} \Bigl\{ z = \sum_{i \in M} u_i a_i + \sum_{j \in N} v_j b_j \;\Bigm|\; \sum_{(i,j) \in J} (u_i + v_j - c_{ij})\, w_{ij} \geq 0 \ \text{for no } w = (w_{11}, \ldots, w_{mn})^T \in R_0^{mn} \Bigr\}, \tag{2}$$

where $u_i, v_j \in R^k$ are $k$-dimensional dual variables. The following duality result will be applied
[4]:

THEOREM 4: Let $x^0 = (x^0_{11}, \ldots, x^0_{mn})^T$ be a feasible solution for (1). The solution $x^0$ is
an efficient solution for (1) if and only if there exists a feasible solution $u^0_i$ ($i = 1, \ldots, m$),
$v^0_j$ ($j = 1, \ldots, n$) for the dual problem such that
$\sum_{(i,j) \in J} c_{ij} x^0_{ij} = \sum_{i \in M} u^0_i a_i + \sum_{j \in N} v^0_j b_j$. The
solution $u^0_i$ ($i = 1, \ldots, m$), $v^0_j$ ($j = 1, \ldots, n$) is then itself an efficient solution for (2).




Since  in  (1)  the  system  of  constraints  has  the  property  that  every  basis  is  triangular,  a 
feasible  basic  solution  for  (1)  is  easily  obtained. 
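
One standard way to obtain such a feasible basic solution is the northwest-corner rule, sketched below. The paper does not say which starting rule it uses, so this is an assumption made for illustration; the availabilities and demands are the values that appear to be legible in Figure 1 ($a = (100, 125, 75)$, $b = (60, 80, 160)$), and the result need not coincide with the starting solution of that figure.

```python
from typing import Dict, List, Tuple


def northwest_corner(supply: List[int], demand: List[int]) -> Dict[Tuple[int, int], int]:
    """Return m + n - 1 basic cells (1-based indices) with their shipped quantities."""
    s, d = list(supply), list(demand)
    if sum(s) != sum(d):
        raise ValueError("total availability must equal total demand")
    m, n = len(s), len(d)
    i = j = 0
    basis: Dict[Tuple[int, int], int] = {}
    while True:
        q = min(s[i], d[j])           # ship as much as possible through cell (i, j)
        basis[(i + 1, j + 1)] = q
        s[i] -= q
        d[j] -= q
        if i == m - 1 and j == n - 1:
            break
        # Move down when the row is exhausted (or the last column is reached),
        # otherwise move right; exactly one move per step keeps m + n - 1 cells.
        if j == n - 1 or (s[i] == 0 and i < m - 1):
            i += 1
        else:
            j += 1
    return basis


start = northwest_corner([100, 125, 75], [60, 80, 160])
print(start)  # {(1, 1): 60, (1, 2): 40, (2, 2): 40, (2, 3): 85, (3, 3): 75}
```

The five cells form a staircase, so the corresponding basis is triangular in the sense of Theorem 3, and the integer data yield integer shipments.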

Let $x^0 = (x^0_{11}, \ldots, x^0_{mn})^T$ be a feasible basic solution for (1), and let $J^0$ denote the index set
of the basic variables in $x^0$, i.e., $J^0 = \{(i,j) \in J \mid x_{ij}$ is a basic variable of the basic solution $x^0\}$.
Now consider the system

(3) $c_{ij} = u_i + v_j$ for all $(i,j) \in J^0$.

Since (3) is a system of $n + m - 1$ vector equations in $n + m$ vector-valued variables, we can
assign an arbitrary value to one of the $m + n$ vector-valued simplex multipliers and then
evaluate the remaining $n + m - 1$ multipliers thereby rendered unique. We shall assign a zero
vector to $u_1$ and then evaluate the remaining $u_i$ ($i = 2, \ldots, m$) and $v_j$ ($j = 1, \ldots, n$). Let
$u^0_i$ ($i = 1, \ldots, m$), $v^0_j$ ($j = 1, \ldots, n$) be the solution for the system (3). After multiplying
the $i$th row equation in (1) by $u^0_i$ and the $j$th column equation by $v^0_j$ and summing up over $i$
and $j$, respectively, we obtain

$$\sum_{i \in M} \sum_{j \in N} u^0_i x_{ij} = \sum_{i \in M} u^0_i a_i, \qquad \sum_{j \in N} \sum_{i \in M} v^0_j x_{ij} = \sum_{j \in N} v^0_j b_j.$$

Let $z^0 = \sum_{(i,j) \in J} c_{ij} x^0_{ij} = \sum_{i \in M} u^0_i a_i + \sum_{j \in N} v^0_j b_j$ and $d^0_{ij} = c_{ij} - u^0_i - v^0_j$ for all $(i,j) \in J$.
Then the vector-valued objective function can be written as

$$z = z^0 + \sum_{(i,j) \in J} d^0_{ij} x_{ij}.$$

Recall that by (3) $d^0_{ij} = 0$ if $x_{ij}$ is a basic variable of $x^0$.
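
The computation of the vector-valued multipliers and reduced criterion vectors can be sketched as follows. The code fixes $u_1 = 0$ as in the text and propagates through the basic cells; the 2-by-2 data with $k = 2$ objectives are hypothetical and chosen only to keep the arithmetic visible.

```python
from typing import Dict, Iterable, Tuple

Cell = Tuple[int, int]
Vec = Tuple[int, ...]


def sub(a: Vec, b: Vec) -> Vec:
    """Componentwise difference of two criterion vectors."""
    return tuple(x - y for x, y in zip(a, b))


def simplex_multipliers(costs: Dict[Cell, Vec], basis: Iterable[Cell]):
    """Solve c_ij = u_i + v_j on the basic cells with u_1 fixed to the zero vector."""
    k = len(next(iter(costs.values())))
    u: Dict[int, Vec] = {1: (0,) * k}
    v: Dict[int, Vec] = {}
    pending = set(basis)
    while pending:
        progressed = False
        for i, j in sorted(pending):
            if i in u and j not in v:
                v[j] = sub(costs[(i, j)], u[i])
            elif j in v and i not in u:
                u[i] = sub(costs[(i, j)], v[j])
            elif i not in u and j not in v:
                continue  # neither multiplier known yet; retry on a later pass
            pending.discard((i, j))
            progressed = True
        if not progressed:
            raise ValueError("basic cells are not connected to row 1")
    return u, v


def reduced_vectors(costs: Dict[Cell, Vec], u: Dict[int, Vec], v: Dict[int, Vec]):
    """d_ij = c_ij - u_i - v_j for every cell."""
    return {(i, j): sub(sub(c, u[i]), v[j]) for (i, j), c in costs.items()}


# Hypothetical criterion vectors c_ij and a hypothetical basis.
costs = {(1, 1): (1, 4), (1, 2): (2, 3), (2, 1): (3, 1), (2, 2): (5, 2)}
u, v = simplex_multipliers(costs, [(1, 1), (1, 2), (2, 2)])
d = reduced_vectors(costs, u, v)
print(u, v)
print(d[(2, 1)])  # (-1, -2)
```

Here the only nonbasic cell has a reduced criterion vector that is negative in both components, so increasing $x_{21}$ would lower both objectives and this hypothetical basis would not be efficient.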

THEOREM 5: Let $x^0 = (x^0_{11}, \ldots, x^0_{mn})^T$ be a feasible basic solution for (1). If the system

(4) $\sum_{(i,j) \in J} d^0_{ij} w_{ij} \leq 0$, $w_{ij} \geq 0$ for all $(i,j) \in J$,

has no solution $w = (w_{11}, \ldots, w_{mn})^T$, then $x^0$ is an efficient solution for (1).

PROOF: Let the system (4) have no solution, which means that the $m + n$ vector-valued
simplex multipliers $u^0_i$, $v^0_j$ are a feasible solution for the dual problem (2). Then there exists
no $x \in S$ such that $\sum_{(i,j) \in J} d^0_{ij} x_{ij} \leq 0$ holds. This implies, in connection with

$$\sum_{(i,j) \in J} d^0_{ij} x_{ij} = \sum_{(i,j) \in J} c_{ij} x_{ij} - \sum_{i \in M} \sum_{j \in N} u^0_i x_{ij} - \sum_{j \in N} \sum_{i \in M} v^0_j x_{ij} = \sum_{(i,j) \in J} c_{ij} x_{ij} - \sum_{(i,j) \in J} c_{ij} x^0_{ij},$$

that $\sum_{(i,j) \in J} c_{ij} x_{ij} \leq \sum_{(i,j) \in J} c_{ij} x^0_{ij}$ for no $x \in S$ and, hence, $x^0 \in S^0$.

The converse of Theorem 5 will hold only for nondegenerate $x^0$.

THEOREM 6: Let $x^0 = (x^0_{11}, \ldots, x^0_{mn})^T$ be an efficient nondegenerate basic solution for
the multiple-objective transportation problem. Then the system (4) has no solution $w$.




PROOF: Since $x^0$ is nondegenerate, a unique basis and a unique set of reduced criterion
vectors $d^0_{ij}$ $[(i,j) \in J]$ can be associated to $x^0$. Moreover, as $x^0$ is also efficient, according to
Theorem 1 in [3] there exists a $\lambda^0 \in R^k_+$ such that $(\lambda^0)^T d^0_{ij} \geq 0$ holds for all $(i,j) \in J$. This
implies, by Motzkin's theorem of the alternative ([6], p. 28), that the system (4) has no
solution $w$.

An efficient basic solution for the multiple-objective transportation problem (1) is said to
be dual feasible if and only if the system (4) has no solution $w$. Dual feasibility is a sufficient
condition for the efficiency of a feasible basic solution for (1); it is, in addition to this,
necessary for the efficiency of a nondegenerate feasible basic solution for (1). A degenerate feasible
basic solution may be efficient without being dual feasible. However, Theorem 4 implies that
from the set of all degenerate efficient basic solutions which have one efficient extreme point of
$S$ in common at least one degenerate efficient basic solution is dual feasible. Moreover, in view
of Theorem 4, the determination of all dual feasible efficient solutions for (1) is adequate to
construct the set $S^0$. For this reason, the proposed solution procedure will concentrate on
determining all dual feasible efficient solutions for the multiple-objective transportation problem
(1).

Recall that the matrix $A$ of the multiple-objective transportation problem (1) is totally
unimodular. If the coefficients $a_i$, for all $i \in M$, and $b_j$, for all $j \in N$, are integers, the values of
the variables are integers in every feasible basic solution and, hence, in every efficient basic
solution.

An alternate problem representation for the multiple-objective transportation problem (1)
is the multiple-objective transportation tableau. The multiple-objective transportation tableau
is, by definition, a rectangular tableau having $m$ rows corresponding to the origins $O_i$ and $n$
columns corresponding to the destinations $D_j$. Each square $(i,j)$ at the intersection of row $i$
and column $j$ contains quantities associated to the variable $x_{ij}$: the criterion vector $c_{ij}$, the value
of the variable $x_{ij}$, and the reduced criterion vector $d_{ij}$. Moreover, the transportation tableau is
bordered by a zero column containing the availabilities $a_i$, a zero row containing the demands
$b_j$, a marginal column containing the vector-valued row multipliers $u_i$, and a marginal row
containing the vector-valued column multipliers $v_j$.

A numerical example of a multiple-objective transportation problem with three objective
functions (Figure 1) will be employed to illustrate the solution procedure. We shall apply a
multiple-objective transportation tableau to state the problem and a starting feasible basic
solution.

From this tableau we can observe that the present feasible basic solution $x^0$ is not dual feasible,
because $d^0_{23} \leq 0$.

2.  PHASE  I:  DETERMINATION  OF  AN  INITIAL  EFFICIENT  BASIC  SOLUTION 

In  order  to  determine  a first  efficient  solution  for  (1)  we  shall  apply  the  following  result: 

THEOREM 7: At least one feasible basic solution for the multiple-objective transportation
problem (1) is an efficient solution.

PROOF: Recall that the set of feasible solutions for (1) is nonempty ([7], p. 226) and
each feasible solution $x$ for (1) can be expressed as a convex combination of the finitely many
feasible basic solutions for (1) ([5], p. 145). Let $X = \{x^1, \ldots, x^r\}$ be the set of all feasible
basic solutions for (1). Consider the set $Z = \{z^1, \ldots, z^r\}$, where $z^p = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x^p_{ij}$ for all
$p = 1, \ldots, r$. Let $z^{*}$ be a lexicographically minimal vector in $Z$, i.e., $z^{*} = \operatorname{lexmin} \{z^1, \ldots, z^r\}$,
and let $x^{*} \in X$ be a feasible basic solution with objective vector $z^{*}$.
Then $x^{*}$ is an efficient solution for (1) since, for each $x^p \in X$, $z^p \leq z^{*}$ does not hold and,
because of the linearity of the vector function $z$, for each $x' \in S$, $z' = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x'_{ij} \leq z^{*}$ does not
hold either.

[FIGURE 1. Multiple-objective transportation tableau 1. Legible data: availabilities $a_1 = 100$, $a_2 = 125$, $a_3 = 75$; demands $b_1 = 60$, $b_2 = 80$, $b_3 = 160$. The criterion vectors, the values of the basic variables, and the multipliers are not recoverable from this copy.]

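Applying the proof of Theorem 7, we can determine an initial efficient basic solution for
the multiple-objective transportation problem (1) by solving the problem

$$\operatorname{lex\,min} \Bigl\{ \sum_{(i,j) \in J} c_{ij} x_{ij} \;\Bigm|\; \sum_{j \in N} x_{ij} = a_i \ \text{for all } i \in M,\ \ \sum_{i \in M} x_{ij} = b_j \ \text{for all } j \in N,\ \ x_{ij} \geq 0 \ \text{for all } (i,j) \in J \Bigr\}. \tag{5}$$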



Recall that a feasible solution for (1) which is a unique optimal solution with respect to the
scalar-valued objective function $z_1$ is an optimal basic solution for (5) and, hence, an efficient
basic solution for (1). However, if there is no unique optimal basic solution with respect to the
scalar-valued objective function $z_1$, some of these optimal basic solutions may not be efficient
solutions for (1). By the lexicographic-minimum problem (5), one feasible basic solution
which is optimal with respect to $z_1$ is identified that is an efficient basic solution for (1).

Consider the transportation problem (5) and let $x^0$ be a feasible basic solution for (5) and
$u^0_i$ ($i = 1, \ldots, m$), $v^0_j$ ($j = 1, \ldots, n$) be the associated simplex multipliers. If all relative
criterion vectors $d^0_{ij}$ are lexicographically greater than or equal to the zero vector, i.e.,
$d^0_{ij} = c_{ij} - u^0_i - v^0_j \succeq 0$ (in the lexicographic sense) for all $(i,j) \in J$, then $x^0$ is an optimal solution for (5) and, hence, the
initial efficient solution for the multiple-objective transportation problem (1). To show that the
initial efficient solution for (1) thus determined is also dual feasible, let us assume, to the
contrary, that the system (4) has a solution $w^0$. This implies that the zero vector is
lexicographically greater than $\sum_{(i,j) \in J} d^0_{ij} w^0_{ij}$ and, hence, also lexicographically greater than $d^0_{ij}$ for at least one
pair $(i,j) \in J$, which is a contradiction of the optimality criterion applied above.

In  Phase  I of  the  multiple-objective  transportation  algorithm  we  have  the  following  steps: 

STEP 1: Determine an initial feasible basic solution in the multiple-objective transportation
tableau.

STEP 2: Designate the set of pairs of indices $(i,j)$ of the basic variables by $I$, and solve
the system $u_1 = 0$, $u_i + v_j = c_{ij}$ for all $(i,j) \in I$.

STEP 3: Compute the relative criterion vector $d_{ij} = c_{ij} - u_i - v_j$ for all $(i,j) \in J \setminus I$.

STEP 4: Select $d_{st} = \operatorname{lexmin} \{d_{ij} \mid (i,j) \in J \setminus I\}$.

STEP 5: If $d_{st}$ is lexicographically greater than or equal to the zero vector, go to Step 7. Otherwise go to Step 6.

STEP 6: The variable $x_{st}$ becomes a basic variable of the new feasible basic solution.
Change the current solution to a new feasible basic solution and go to Step 2.

STEP 7: Designate the current feasible basic solution by $x^1$ and the corresponding index
set $I$ by $J^1$. The solution $x^1$ is an optimal solution for (5) and, hence, the initial efficient basic
solution for (1). Store $x^1$, $z^1 = \sum_{(i,j) \in J} c_{ij} x^1_{ij}$, and $J^1$.
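
Steps 4 and 5 reduce to a lexicographic comparison of vectors, which Python tuples provide directly. The sketch below is a minimal illustration with hypothetical reduced criterion vectors; the function name is an assumption, and the pivot of Step 6 is not shown.

```python
from typing import Dict, Optional, Tuple

Cell = Tuple[int, int]


def select_entering(reduced: Dict[Cell, Tuple[int, ...]]) -> Optional[Cell]:
    """Return the nonbasic cell with the lexicographically smallest vector,
    or None if that vector is already lexicographically >= 0 (Step 5)."""
    cell = min(reduced, key=lambda ij: reduced[ij])  # tuples compare lexicographically
    d_st = reduced[cell]
    zero = (0,) * len(d_st)
    return None if d_st >= zero else cell


# Hypothetical reduced criterion vectors of the nonbasic cells (k = 3).
nonbasic = {(1, 1): (2, -1, 0), (1, 2): (-1, 5, 5), (2, 1): (-1, 4, 9)}
print(select_entering(nonbasic))  # (2, 1)

# All vectors lexicographically nonnegative: the optimality test of Step 5 succeeds.
print(select_entering({(1, 1): (0, 0, 1), (1, 2): (1, -5, 0)}))  # None
```

Note that a vector such as $(1, -5, 0)$ passes the lexicographic test although it has a negative component; this is why the lexicographic optimum is efficient without every reduced vector being componentwise nonnegative.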


To illustrate Phase I of the multiple-objective transportation algorithm, we shall depart
from the feasible basic solution of the multiple-objective transportation tableau 1. Since Steps 2
and 3 have been performed, we proceed with Step 4: $d^0_{23} = \operatorname{lexmin} \{d^0_{ij} \mid (i,j) \in J \setminus I\}$
(the components of $d^0_{23}$ are not legible in this copy).
As $d^0_{23}$ is not lexicographically greater than or equal to the zero vector, the current feasible basic solution does not satisfy the sufficient
optimality criterion. The variable $x_{23}$ becomes a basic variable of the new feasible basic
solution, which is specified in multiple-objective transportation tableau 2 (Figure 2).

Since $d_{21} = \operatorname{lexmin} \{d_{ij} \mid (i,j) \in J \setminus I\}$ and $d_{21}$ is lexicographically greater than or equal to the zero vector, the current feasible basic solution is
optimal for (5) and is, hence, the initial efficient basic solution for (1). We shall designate this
solution by $x^1$ and store $x^1$, $z^1$, and $J^1 = \{(1,3), (2,2), (2,3), (3,1), (3,3)\}$.




3.  PHASE  II:  CONSTRUCTING  THE  SET  OF  ALL  EFFICIENT  BASIC  SOLUTIONS 

At the beginning of Phase II a dual feasible efficient basic solution $x^1$ for (1) and the
associated multiple-objective transportation tableau are available. We shall now determine the
remaining efficient basic solutions for (1) which are dual feasible. First let us introduce an
adjacency relation between efficient basic solutions.

Let $x^{\sigma}$ and $x^{\nu}$ be efficient basic solutions for (1). Solutions $x^{\sigma}$ and $x^{\nu}$ are said to be
adjacent if and only if (i) and (ii) hold:

(i) $x^{\sigma}$ and $x^{\nu}$ have $n + m - 2$ basic variables in common;

(ii) each $x^0 = \alpha x^{\sigma} + (1 - \alpha) x^{\nu}$ ($0 \leq \alpha \leq 1$) is an efficient solution for (1).

Let us consider an efficient basic solution $x^{\sigma}$ for the multiple-objective transportation
problem (1) and the associated multiple-objective transportation tableau. In order to determine
all dual feasible efficient basic solutions for (1) which are adjacent to $x^{\sigma}$ we construct the set
$P^{\sigma} = \{(i,j) \in J \setminus J^{\sigma} \mid d^{\sigma}_{ij} \not\geq 0\}$. Note that every feasible basic solution obtained by making $x_{ij}$,
$(i,j) \in J \setminus J^{\sigma}$ with $(i,j) \notin P^{\sigma}$, a basic variable is not dual feasible and, if $x_{ij}$ has been increased to a positive value,
not efficient. Hence, if $P^{\sigma}$ is empty, $x^{\sigma}$ is the unique dual feasible efficient solution for (1).
However, not every $x_{ij}$ with $(i,j) \in P^{\sigma}$ is automatically an incoming variable of an adjacent
efficient basic solution.

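THEOREM 8: Let $x^{\sigma}$ be a dual feasible efficient basic solution for (1) and $Q^{\sigma}$ a
nonempty subset of $P^{\sigma}$. If the linear program

$$\max\ \mathbf{1}^T y \quad \text{subject to} \quad \sum_{(i,j) \in P^{\sigma}} d^{\sigma}_{ij} w_{ij} + I y = \mathbf{1}, \quad y \geqq 0, \quad w_{ij} \geq 0 \ \text{for all } (i,j) \in P^{\sigma} \setminus Q^{\sigma}, \tag{6}$$

has an optimal solution, then each $x_{ij}$, $(i,j) \in Q^{\sigma}$, is an incoming variable of a dual feasible
efficient basic solution for (1) which is adjacent to $x^{\sigma}$.

PROOF: Let (6) have an optimal solution $w_{ij}$ $[(i,j) \in P^{\sigma}]$, $y$. Then the linear program
dual to (6),

$$\min\ \mathbf{1}^T \lambda \quad \text{subject to} \quad (d^{\sigma}_{ij})^T \lambda \geq 0 \ \text{for all } (i,j) \in P^{\sigma} \setminus Q^{\sigma}, \quad (d^{\sigma}_{ij})^T \lambda = 0 \ \text{for all } (i,j) \in Q^{\sigma}, \quad \lambda \geqq \mathbf{1}, \tag{7}$$

has an optimal solution $\hat{\lambda} \in R^k_+$. Not only $x^{\sigma}$, but also every feasible basic solution which is
obtained from $x^{\sigma}$ by changing $x_{ij}$ $[(i,j) \in Q^{\sigma}]$ to a basic variable, is optimal for

$$\min \Bigl\{ \lambda^T \sum_{(i,j) \in J} c_{ij} x_{ij} \;\Bigm|\; \sum_{j \in N} x_{ij} = a_i \ \text{for all } i \in M,\ \ \sum_{i \in M} x_{ij} = b_j \ \text{for all } j \in N,\ \ x_{ij} \geq 0 \ \text{for all } (i,j) \in J \Bigr\} \tag{8}$$

with $\lambda = \hat{\lambda}$. Each of the feasible basic solutions thus enumerated is efficient, according to
Theorem 1 in [3], and adjacent to $x^{\sigma}$. Applying Motzkin's theorem of the alternative (see,
e.g., [6], p. 28) to the system $(d^{\sigma}_{ij})^T \hat{\lambda} \geq 0$ for all $(i,j) \in P^{\sigma}$, we observe that each of the
efficient basic solutions located is dual feasible. This completes the proof.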
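
As a cross-check on how the linear program (6) and its dual (7) fit together, the following short derivation shows weak duality directly. It assumes the reading of (6) suggested by the numerical example, namely that $w_{ij}$ is sign-restricted on $P^{\sigma} \setminus Q^{\sigma}$ and free on $Q^{\sigma}$, with $y \geqq 0$; $\mathbf{1}$ denotes the $k$-vector of ones.

```latex
% For any \lambda \in R^k and any (w, y) satisfying the equality constraint of (6):
\begin{align*}
\mathbf{1}^T y
  &= \mathbf{1}^T y
     + \lambda^T \Bigl( \mathbf{1} - \sum_{(i,j) \in P^{\sigma}} d^{\sigma}_{ij} w_{ij} - y \Bigr) \\
  &= \mathbf{1}^T \lambda
     - (\lambda - \mathbf{1})^T y
     - \sum_{(i,j) \in P^{\sigma}} \bigl( (d^{\sigma}_{ij})^T \lambda \bigr) w_{ij} .
\end{align*}
% If \lambda \geqq \mathbf{1}, (d^{\sigma}_{ij})^T \lambda \geq 0 on P^{\sigma} \setminus Q^{\sigma},
% and (d^{\sigma}_{ij})^T \lambda = 0 on Q^{\sigma}, then y \geqq 0 and w_{ij} \geq 0 on
% P^{\sigma} \setminus Q^{\sigma} make both subtracted terms nonnegative, hence
\[
  \mathbf{1}^T y \;\leq\; \mathbf{1}^T \lambda .
\]
```

Every feasible $\lambda$ of the dual therefore bounds the primal objective from above, and the dual program looks for the smallest such bound. The equality constraints on $Q^{\sigma}$ are exactly what the free variables $w_{ij}$, $(i,j) \in Q^{\sigma}$, require.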

THEOREM 9: Let $x^{\sigma}$ be a dual feasible efficient basic solution for (1) and $x^{\nu}$ be a dual
feasible efficient basic solution for (1) which is adjacent to $x^{\sigma}$. Let $(s,t) \in P^{\sigma}$ denote the pair
of indices of the incoming variable in order to proceed from $x^{\sigma}$ to $x^{\nu}$. Then there exists some
$Q^{\sigma} \subseteq P^{\sigma}$ with $(s,t) \in Q^{\sigma}$ such that (6) has an optimal solution.

PROOF: The adjacency relation between $x^{\sigma}$ and $x^{\nu}$ implies, in conjunction with Theorem
1 in [3], that for $Q^{\sigma} = \{(s,t)\}$ (7) has a feasible solution. As in (7) the objective function is
bounded from below in the respective feasible set, (7) and hence also (6) have an optimal
solution for $Q^{\sigma} = \{(s,t)\}$. This completes the proof.




Let $H$ denote the index set of the dual feasible efficient basic solutions for (1). All
$x^{\sigma}$ ($\sigma \in H$) and the existing adjacency relations can be represented by an undirected graph: Let
$E = \{x^{\sigma} \mid \sigma \in H\}$ and $L = \{(x^{\sigma}, x^{\nu}) \mid x^{\sigma}$ and $x^{\nu}$ are adjacent $(\sigma, \nu \in H)\}$. The undirected graph
$G = (E, L)$ is said to be the solution graph associated with the multiple-objective transportation
problem (1).

Theorems 8 and 9 establish the theoretical basis for identifying all $x^{\nu}$ ($\nu \in H$) which are
adjacent to any $x^{\sigma}$. Moreover, in connection with a fundamental property of the solution graph
$G$, these two theorems also justify the proposed procedure to determine all $x^{\sigma}$ ($\sigma \in H$).

THEOREM  10:  The  solution  graph  G is  finite  and  connected. 

PROOF: Since the number of feasible basic solutions for (1) is finite, $G$ is obviously
finite. In order to see that $G$ is connected, consider any pair $(x^{\sigma}, x^{\nu})$ with $\sigma, \nu \in H$. According
to Theorem 1 in [3] there exists a $\lambda^{\sigma} \in R^k_+$ ($\lambda^{\nu} \in R^k_+$) such that $x^{\sigma}$ ($x^{\nu}$) is optimal for (8) with
$\lambda = \lambda^{\sigma}$ ($\lambda = \lambda^{\nu}$). By solving the one-parametric transportation problem

$$\min \Bigl\{ \bigl(\alpha \lambda^{\sigma} + (1 - \alpha) \lambda^{\nu}\bigr)^T \sum_{(i,j) \in J} c_{ij} x_{ij} \;\Bigm|\; \sum_{j \in N} x_{ij} = a_i \ \text{for all } i \in M,\ \ \sum_{i \in M} x_{ij} = b_j \ \text{for all } j \in N,\ \ x_{ij} \geq 0 \ \text{for all } (i,j) \in J \Bigr\} \tag{9}$$

for all $\alpha$, $1 \geq \alpha \geq 0$, we obtain a sequence of efficient basic solutions $x^{\sigma}, \ldots, x^{\nu}$ where
each two consecutive efficient basic solutions are adjacent. Hence each pair $(x^{\sigma}, x^{\nu})$ of efficient
basic solutions for (1) is linked by a chain in $G$. This completes the proof.

Consider an efficient basic solution $x^{\sigma}$ and let (6) have an optimal solution
$w_{ij}$ $[(i,j) \in P^{\sigma}]$, $y$ for $Q^{\sigma} = \{(s,t)\}$. If the pair of indices of each basic variable $w_{ij}$ with
$(i,j) \neq (s,t)$ of the given optimal solution for (6) is included in $Q^{\sigma}$, the initial optimal solution
for (6) still remains optimal. In the course of the algorithm we shall enlarge the index set $Q^{\sigma}$
in the proposed way, if possible, in order to reduce the number of linear programs (6) to be
solved. Further investigations with respect to the index set $Q^{\sigma}$ are necessary if we do not
confine ourselves to the enumeration of all dual feasible efficient basic solutions for (1) but aim at
the entire set $S^0$. We have to examine whether or not further pairs of indices $(i,j) \in P^{\sigma}$ can be
included in $Q^{\sigma}$ such that the respective linear program (6) still has an optimal solution. Let
$Q^{\sigma} \neq P^{\sigma}$ have less than $q = \min \{k, |P^{\sigma}|\}$ elements; then we successively examine whether or
not by dropping the sign restriction of a nonbasic variable $w_{ij}$ of the current optimal solution for
(6) this variable can become a basic variable of an optimal solution for (6) in exchange for a
basic variable $y_l$. If at the end of this procedure $Q^{\sigma} = P^{\sigma}$, $Q^{\sigma}$ is obviously maximal. At the
end of this procedure let $Q^{\sigma} \neq P^{\sigma}$. Then the pair of indices $(i,j)$ of each nonbasic variable $w_{ij}$
of the current optimal solution for (6) for which the reduced cost coefficient in the simplex
tableau is zero has to be included in $Q^{\sigma}$ in order to obtain a maximal index set $Q^{\sigma}$. Since the
linear program (6) on which the described analysis is conducted has only $k$ constraints, the
computation in connection with the preceding analysis is easily carried out.

To ensure that for $x^{\sigma}$ ($\sigma \in H$) all maximal index sets $Q^{\sigma}$ have been determined, further
analysis is necessary. Let $T^{\sigma} = \{(s_1,t_1), \ldots, (s_g,t_g)\}$ denote the set of all pairs of indices which
belong to at least one of the maximal sets $Q^{\sigma}$. All combinations of the $g$ pairs of indices
$(s_1,t_1), \ldots, (s_g,t_g)$ taken $h$ pairs of indices at a time with $h = g, g-1, \ldots, 1$ can be
represented in a directed graph $\Gamma(T^{\sigma})$, in which the node which represents the combination
$\langle (s_1,t_1), \ldots, (s_g,t_g) \rangle$ is the source and the $g$ nodes which represent all combinations of
the $g$ pairs of indices taken one pair of indices at a time are the sinks of $\Gamma(T^{\sigma})$. Each node of
$\Gamma(T^{\sigma})$ represents a potential maximal set $Q^{\sigma}$. As one or more maximal sets $Q^{\sigma}$ have already
been determined, the graph $\Gamma(T^{\sigma})$ has to be adjusted: For each of the already determined
maximal index sets $Q^{\sigma}$ the corresponding node, as well as all predecessors and successors, will be
deleted. If the adjusted graph $\Gamma(T^{\sigma})$ has no node, all maximal index sets $Q^{\sigma}$ have been
identified. In order to identify further maximal index sets $Q^{\sigma}$ or to make sure that no further
maximal index set exists, we select a sink of the adjusted graph $\Gamma(T^{\sigma})$ and solve the linear
program (6) for the corresponding set $Q^{\sigma}$. If (6) has no optimal solution, the node which
corresponds to $Q^{\sigma}$ and all its predecessors can be deleted. If, however, the program (6) has an
optimal solution, a maximal set $Q^{\sigma}$ will be determined and the corresponding node, all
predecessors, and all successors will be deleted.
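
The pruning logic on the graph of combinations can be sketched as a search over subsets. The code below is a simplified illustration, not the paper's procedure: it starts from scratch rather than from already located maximal sets, and it replaces the linear program (6) by a hypothetical oracle `has_optimum(Q)`, assumed to have the property used in the text that failure for a set implies failure for every superset.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List, Tuple

Cell = Tuple[int, int]


def maximal_sets(pairs: Iterable[Cell],
                 has_optimum: Callable[[FrozenSet[Cell]], bool]) -> List[FrozenSet[Cell]]:
    """Return all maximal sets Q accepted by the oracle."""
    pairs = list(pairs)
    # Nodes of the graph: every nonempty combination of the index pairs.
    nodes = {frozenset(c) for h in range(1, len(pairs) + 1)
             for c in combinations(pairs, h)}
    found: List[FrozenSet[Cell]] = []
    while nodes:
        # A smallest remaining node has no remaining subset, i.e. it is a sink.
        q = min(nodes, key=lambda s: (len(s), sorted(s)))
        if not has_optimum(q):
            nodes = {s for s in nodes if not s >= q}      # delete q and its supersets
            continue
        for p in pairs:                                   # enlarge q to a maximal set
            if p not in q and has_optimum(q | {p}):
                q = q | {p}
        found.append(q)
        nodes = {s for s in nodes if not (s >= q or s <= q)}  # delete super- and subsets
    return found


# Hypothetical oracle: only subsets of {(1,2), (2,1)} or of {(3,3)} are accepted.
def toy_oracle(q: FrozenSet[Cell]) -> bool:
    return q <= {(1, 2), (2, 1)} or q <= {(3, 3)}


print(maximal_sets([(1, 2), (2, 1), (3, 3)], toy_oracle))
```

For the toy oracle the search reports the two maximal sets $\{(1,2), (2,1)\}$ and $\{(3,3)\}$ after four oracle calls on sinks and extensions that succeed and three that fail, instead of testing all seven combinations blindly.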

We shall now utilize the above results and describe a systematic method to enumerate all
dual feasible efficient basic solutions for (1) and, for each efficient basic solution $x^{\sigma}$, all
maximal index sets $Q^{\sigma}$. Let $\sigma$ be the index of that efficient basic solution currently under review; $\nu$
is the number of efficient basic solutions identified so far. The set $V^{\sigma} = \{\sigma + 1, \ldots, \nu\}$ is the
index set of the unexplored and $W^{\sigma} = \{1, \ldots, \sigma\}$ is the index set of the explored efficient
basic solutions, including that one which is currently under review. Steps 9 to 23 of the
algorithm apply in the main to the enumeration of all dual feasible efficient basic solutions.

STEP 8: Put $\sigma = 1$, $\nu = 1$, $V^0 = \{1\}$, and $W^0 = \emptyset$.

STEP 9: Select $x^{\sigma}$ and construct the associated multiple-objective transportation tableau.
Put $V^{\sigma} = V^{\sigma-1} \setminus \{\sigma\}$, $W^{\sigma} = W^{\sigma-1} \cup \{\sigma\}$, and $T^{\sigma} = \emptyset$.

STEP 10: Construct the index set $P^{\sigma}$ and put $P = P^{\sigma}$.

STEP 11: If $P = \emptyset$, go to Step 24. Otherwise go to Step 12.

STEP 12: Select some $(s,t) \in P$ and solve (6) for $Q^{\sigma} = \{(s,t)\}$.

STEP 13: If (6) has no optimal solution, put $P = P \setminus \{(s,t)\}$ and go to Step 11.

STEP 14: Include in $Q^{\sigma}$ all pairs of indices $(i,j) \in P^{\sigma}$ such that $Q^{\sigma}$ is a maximal set.
Store $Q^{\sigma}$ and put $T^{\sigma} = T^{\sigma} \cup Q^{\sigma}$.

STEP 15: Put $Q = P \cap Q^{\sigma}$ and $P = P \setminus Q$.

STEP 16: Select some $(s,t) \in Q$. Determine all $\bar{h}$ feasible basic solutions $x^h$, $z^h$, and the
index sets $J^h$ which can be constructed by making $x_{st}$ a basic variable. Put $h = 0$.

STEP 17: Put $h = h + 1$.

STEP 18: If there is any $\gamma \in V^{\sigma} \cup W^{\sigma}$ with $J^{\gamma} = J^h$, go to Step 19. Otherwise go to
Step 22.

STEP 19: If $h = \bar{h}$ go to Step 20. Otherwise go to Step 17.

STEP 20: Put $Q = Q \setminus \{(s,t)\}$.

STEP 21: If $Q = \emptyset$ go to Step 11. Otherwise go to Step 16.

STEP 22: Put $\nu = \nu + 1$, $x^{\nu} = x^h$ and $V^{\sigma} = V^{\sigma} \cup \{\nu\}$. Store $x^{\nu}$, $J^{\nu}$ and $z^{\nu}$ and go to
Step 19.

STEP 23: If $\sigma = \nu$ print $x^{\gamma}$ and $z^{\gamma}$ for $\gamma = 1, \ldots, \nu$. Otherwise put $\sigma = \sigma + 1$ and go to
Step 9.

The following steps apply to the construction of the graph $\Gamma(T^\sigma)$ and the process of identifying
all further maximal sets $Q^\sigma$.

STEP 24: Construct the graph $\Gamma(T^\sigma)$ and adjust it according to the located maximal sets
$Q^\sigma$.

STEP 25: If the adjusted graph $\Gamma(T^\sigma)$ has at least one node, go to Step 26. Otherwise go
to Step 23.

STEP 26: Select a sink of $\Gamma(T^\sigma)$ and solve the linear program (6) for the corresponding
set $Q^\sigma$.

STEP 27: If the linear program (6) has no optimal solution, adjust $\Gamma(T^\sigma)$ and go to Step
25. Otherwise go to Step 28.

STEP 28: Determine a maximal set $Q^\sigma$ and adjust $\Gamma(T^\sigma)$. Store $Q^\sigma$ and go to Step 25.


To illustrate the preceding steps of the multiple-objective transportation algorithm, we
shall depart from the multiple-objective transportation tableau 2 with the initial efficient basic
solution $x^1$. In Step 10 we construct $P^1 = \{(1,1), (1,2), (3,2)\}$ and put $P = P^1$. Since $P \neq \emptyset$,
in Step 12 we select $(s,t) = (3,2)$ and solve the linear program (6), which becomes


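$$
\begin{aligned}
\max\quad & y_1 + y_2 + y_3 \\
\text{subject to}\quad
 & 9w_{11} + 5w_{12} + 5w_{32} + y_1 = 1, \\
 & -w_{11} - w_{12} - 6w_{32} + y_2 = 1, \\
 & 5w_{11} - 4w_{12} - 7w_{32} + y_3 = 1, \\
 & w_{11},\ w_{12},\ w_{32},\ y_1,\ y_2,\ y_3 \ge 0.
\end{aligned}
$$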

The optimal solution for this linear program is $w_{11} = w_{12} = y_1 = 0$, $w_{32} = 1/5$,
$y_2 = 11/5$, $y_3 = 12/5$. If the pair of indices (1,1) or (1,2) is introduced into $Q^1 = \{(3,2)\}$, the
linear program (6) has no optimal solution. Hence $Q^1 = \{(3,2)\}$ is a maximal index set, which
will be stored. We obtain $T^1 = \{(3,2)\}$. In Step 15 we put $Q = \{(3,2)\}$ and
$P = \{(1,1), (1,2)\}$. We pass to Step 16; $x_{32}$ is the incoming variable. There is only one
feasible basic solution to which $x^1$ can be revised: $x_{32}$ is the new basic variable and $x_{33}$ will
become a nonbasic variable in the new efficient basic solution. Hence $\bar{h} = 1$. As this new
efficient basic solution has not yet been identified, we continue with Step 22. We put $\nu = 2$,
$V^1 = \{2\}$, and label the new efficient basic solution $x^2$. We store $x^2$, $z^2$, and
$J^2 = \{(1,3), (2,2), (2,3), (3,1), (3,2)\}$. The values of $x^2$ and $z^2$ can be read from multiple-
objective transportation tableau 3 (Figure 3).
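
The arithmetic of this step can be checked with a few lines of Python. The sketch below is only an illustration: the three constraint columns are an assumed reading of the printed linear program (one column per candidate variable $w_{11}$, $w_{12}$, $w_{32}$, each row with right-hand side 1), and the names `cols` and `slacks` are hypothetical. It recomputes the slack values for the stated solution and compares the $w_{32}$ column with the per-unit change of the objective vector between $x^1$ and $x^2$ as listed at the end of Phase II.

```python
from fractions import Fraction as F

# Assumed constraint columns of the test problem, one per candidate variable.
cols = {
    "w11": (9, -1, 5),
    "w12": (5, -1, -4),
    "w32": (5, -6, -7),
}

def slacks(w):
    """Solve each equality  sum_j col_j[row] * w_j + y_row = 1  for y_row."""
    return tuple(
        1 - sum(F(cols[name][row]) * F(val) for name, val in w.items())
        for row in range(3)
    )

# Solution quoted in the text: w11 = w12 = 0, w32 = 1/5.
y = slacks({"w11": 0, "w12": 0, "w32": F(1, 5)})
objective = sum(y)

# Moving from x^1 to x^2 brings x32 into the basis with 15 units, so the
# change in the objective vector per unit should match the w32 column.
z1, z2 = (285, 1185, 1525), (360, 1095, 1420)
per_unit = tuple(F(b - a, 15) for a, b in zip(z1, z2))

print(y, objective, per_unit)
```

Exact fractions are used so that the comparison with 11/5 and 12/5 does not depend on floating-point rounding.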

Since $h = \bar{h} = 1$, and $Q = \emptyset$, but $P \neq \emptyset$, we continue with Step 12. For $Q^1 = \{(1,1)\}$ the
linear program (6) has no optimal solution. Hence we put $P = \{(1,2)\}$. Also, for $Q^1 = \{(1,2)\}$
the linear program (6) has no optimal solution. As $P = \emptyset$, we proceed with Step 24 and
construct the graph $\Gamma(T^1)$, which merely consists of the node $\{(3,2)\}$. $Q^1 = \{(3,2)\}$ has already
been identified as a maximal index set; the corresponding node will be deleted. Since the
adjusted graph $\Gamma(T^1)$ has no node we proceed with Step 23. As $P = \emptyset$ but $\sigma = 1 \neq \nu = 2$, we
put $\sigma = 2$ and continue with Step 9.




turns out to be a maximal index set. We store $Q^2 = \{(1,2), (2,1)\}$ and put
$T^2 = \{(1,2), (2,1)\}$. In Step 15, we obtain $Q = \{(1,2), (2,1)\}$ and $P = \{(3,3)\}$. We come to
Step 16 and select $(s,t) = (1,2)$.


There is only one feasible basic solution to which $x^2$ can be revised; $x_{12}$ is the new basic
variable and $x_{22}$ will become a nonbasic variable in this new efficient basic solution. This
efficient basic solution has not been identified so far. In Step 22 we put $\nu = 3$ and $V^2 = \{3\}$
and label the new efficient basic solution $x^3$. We obtain $x^3 = (0, 65, 35, 0, 0, 125, 60, 15, 0)^T$,
with $z^3 = (685, 1030, 1160)^T$ and $J^3 = \{(1,2), (1,3), (2,3), (3,1), (3,2)\}$. In Step 20 we put
$Q = \{(2,1)\}$, and we proceed with Step 16. We select $(s,t) = (2,1)$. With $x_{21}$ being the
incoming variable, a new efficient basic solution can be enumerated which has not been
identified so far. In Step 22 we put $\nu = 4$ and $V^2 = \{3,4\}$ and label the new efficient basic
solution $x^4$. We obtain $x^4 = (0, 0, 100, 60, 5, 60, 0, 75, 0)^T$, with $z^4 = (900, 795, 1180)^T$ and
$J^4 = \{(1,3), (2,1), (2,2), (2,3), (3,2)\}$. Since $Q = \emptyset$ but $P \neq \emptyset$, we proceed with Step 12 and
solve the linear program (6) with $Q^2 = \{(3,3)\}$. This linear program has an optimal solution.
The set $Q^2 = \{(3,3)\}$ turns out to be a maximal set. We store $Q^2 = \{(3,3)\}$ and put
$T^2 = \{(1,2), (2,1), (3,3)\}$. To the pair of indices $(s,t) = (3,3)$, the efficient basic solution $x^1$
has to be associated. As we get $P = \emptyset$, we shall examine whether or not further maximal index
sets $Q^2$ can be determined. We construct the graph $\Gamma(T^2)$ (Figure 4).


Since two maximal index sets $Q^2$ have been determined, $\Gamma(T^2)$ has to be adjusted accordingly.
The adjusted graph $\Gamma(T^2)$ has no node; hence, all maximal index sets $Q^2$ have been
determined. As $V^2 \neq \emptyset$, the algorithm of Phase II is continued with the exploration of $x^3$.

At  the  end  of  Phase  II  seven  efficient  basic  solutions  have  been  enumerated: 


$x^1 = (0, 0, 100, 0, 80, 45, 60, 0, 15)^T$,  $z^1 = (285, 1185, 1525)^T$,  $J^1 = \{(1,3), (2,2), (2,3), (3,1), (3,3)\}$;
$x^2 = (0, 0, 100, 0, 65, 60, 60, 15, 0)^T$,  $z^2 = (360, 1095, 1420)^T$,  $J^2 = \{(1,3), (2,2), (2,3), (3,1), (3,2)\}$;
$x^3 = (0, 65, 35, 0, 0, 125, 60, 15, 0)^T$,  $z^3 = (685, 1030, 1160)^T$,  $J^3 = \{(1,2), (1,3), (2,3), (3,1), (3,2)\}$;
$x^4 = (0, 0, 100, 60, 5, 60, 0, 75, 0)^T$,  $z^4 = (900, 795, 1180)^T$,  $J^4 = \{(1,3), (2,1), (2,2), (2,3), (3,2)\}$;
$x^5 = (0, 5, 95, 60, 0, 65, 0, 75, 0)^T$,  $z^5 = (925, 790, 1160)^T$,  $J^5 = \{(1,2), (1,3), (2,1), (2,3), (3,2)\}$;
$x^6 = (60, 0, 40, 0, 5, 120, 0, 75, 0)^T$,  $z^6 = (1200, 675, 1300)^T$,  $J^6 = \{(1,1), (1,3), (2,2), (2,3), (3,2)\}$;
$x^7 = (60, 5, 35, 0, 0, 125, 0, 75, 0)^T$,  $z^7 = (1225, 670, 1280)^T$,  $J^7 = \{(1,1), (1,2), (1,3), (2,3), (3,2)\}$.
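
The seven solutions can be sanity-checked mechanically. The following sketch assumes that each $x$ lists the entries of a 3 x 3 shipment matrix row by row (an assumption, since the layout is not restated here); the helper names `margins` and `dominates` are hypothetical. It confirms that all seven vectors have the same row and column totals and that no objective vector is componentwise at least as good as another.

```python
solutions = {
    1: ((0, 0, 100, 0, 80, 45, 60, 0, 15), (285, 1185, 1525)),
    2: ((0, 0, 100, 0, 65, 60, 60, 15, 0), (360, 1095, 1420)),
    3: ((0, 65, 35, 0, 0, 125, 60, 15, 0), (685, 1030, 1160)),
    4: ((0, 0, 100, 60, 5, 60, 0, 75, 0), (900, 795, 1180)),
    5: ((0, 5, 95, 60, 0, 65, 0, 75, 0), (925, 790, 1160)),
    6: ((60, 0, 40, 0, 5, 120, 0, 75, 0), (1200, 675, 1300)),
    7: ((60, 5, 35, 0, 0, 125, 0, 75, 0), (1225, 670, 1280)),
}

def margins(x):
    """Row and column totals of x read as a 3 x 3 matrix, row by row."""
    rows = tuple(sum(x[3 * i:3 * i + 3]) for i in range(3))
    cols = tuple(sum(x[j::3]) for j in range(3))
    return rows, cols

def dominates(a, b):
    """True if a is componentwise <= b and differs from b."""
    return a != b and all(ai <= bi for ai, bi in zip(a, b))

all_margins = {margins(x) for x, _ in solutions.values()}
dominated_pairs = [
    (p, q)
    for p, (_, zp) in solutions.items()
    for q, (_, zq) in solutions.items()
    if dominates(zp, zq)
]
print(all_margins, dominated_pairs)
```

Because the test is applied in both directions for every pair, the conclusion does not depend on whether the objectives are read as costs or as benefits.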


As shown by the solution graph G, the algorithm of Phase II may be regarded as
enumerating all the nodes of a connected graph by examining all the neighbors of those nodes
already enumerated. For the example considered the solution graph G may be depicted as
shown in Figure 5.
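
The graph view of Phase II corresponds to a standard breadth-first enumeration, sketched below. This is an illustration rather than the algorithm itself: `enumerate_connected` and `toy_neighbors` are hypothetical names, and the neighbor rule used here (two bases are neighbors when their index sets differ in exactly one pair) is a purely combinatorial stand-in that ignores the efficiency test of Steps 12 to 14.

```python
from collections import deque

def enumerate_connected(start, neighbors):
    """Return every node reachable from start, in order of discovery."""
    order, seen, queue = [start], {start}, deque([start])
    while queue:
        node = queue.popleft()          # node currently under review
        for nxt in neighbors(node):     # examine all of its neighbors
            if nxt not in seen:         # not identified so far
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

# Basic index sets J^1 ... J^7 of the example.
J = {
    1: {(1, 3), (2, 2), (2, 3), (3, 1), (3, 3)},
    2: {(1, 3), (2, 2), (2, 3), (3, 1), (3, 2)},
    3: {(1, 2), (1, 3), (2, 3), (3, 1), (3, 2)},
    4: {(1, 3), (2, 1), (2, 2), (2, 3), (3, 2)},
    5: {(1, 2), (1, 3), (2, 1), (2, 3), (3, 2)},
    6: {(1, 1), (1, 3), (2, 2), (2, 3), (3, 2)},
    7: {(1, 1), (1, 2), (1, 3), (2, 3), (3, 2)},
}

def toy_neighbors(s):
    # Assumed adjacency rule: the two bases share all but one index pair.
    return sorted(t for t in J if t != s and len(J[s] - J[t]) == 1)

order = enumerate_connected(1, toy_neighbors)
print(order)
```

Starting from the first basis, every other basis of the example is reached, which is the connectedness property the enumeration relies on.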




THEOREM 11: Let $x^\gamma$ be a dual feasible efficient basic solution for the multiple-objective
transportation problem. Then $x^\gamma$ has been determined at the conclusion of the described
algorithm of Phase II.

PROOF: Let $x^1$ be the initial dual feasible efficient basic solution. If $J^1 \neq J^\gamma$, then by
Theorem 10 there exists a sequence of adjacent $x^\sigma$ ($\sigma \in H$) linking $x^1$ with $x^\gamma$. The algorithm
of Phase II uncovers each $x^\sigma$ of this sequence, including $x^\gamma$.

4.  PHASE  III:  ENUMERATION  OF  THE  SET  OF  ALL  EFFICIENT  SOLUTIONS 

In Phase II, for each efficient basic solution $x^\sigma$ ($\sigma \in H$) one or more maximal index sets
$Q^\sigma$ have been determined and stored. Let us form the index sets $R^\sigma = Q^\sigma \cup J^\sigma$ for each
maximal index set $Q^\sigma$. If we consider the thus-determined index sets $R^\sigma$, we may discover the
same index set several times and subsets of some index set $R^\sigma$ as well. We shall form $\bar{\pi}$ index
sets $U^\pi$ ($\pi = 1, \ldots, \bar{\pi}$) from the sets $R^\sigma$ with the following property: For each $R^\sigma$ there exists
some $\pi \in \{1, \ldots, \bar{\pi}\}$ such that $R^\sigma \subseteq U^\pi$, and for each $U^\pi$ there exists at least one set $R^\sigma$
such that $R^\sigma = U^\pi$. We further require $U^{\pi'} \subseteq U^{\pi''}$ for no $\pi' \neq \pi''$ ($\pi', \pi'' \in \{1, \ldots, \bar{\pi}\}$), and
thus we obtain a minimal number $\bar{\pi}$ of sets $U^\pi$. Let $I^\pi = \{\sigma \in H \mid J^\sigma \subseteq U^\pi\}$. We are now in
a position to assign to each set $U^\pi$ a convex subset of the set of efficient solutions for the
multiple-objective transportation problem (1),

$$S^\pi = \Bigl\{ x \;\Bigm|\; x = \sum_{\sigma \in I^\pi} \alpha_\sigma x^\sigma,\ \sum_{\sigma \in I^\pi} \alpha_\sigma = 1,\ \alpha_\sigma \ge 0 \text{ for all } \sigma \in I^\pi \Bigr\}.$$

THEOREM 12: $S^\pi$ is a subset of $S^0$ for each $\pi = 1, \ldots, \bar{\pi}$.

PROOF: The construction of $U^\pi$ ensures the existence of some $R^\sigma$ ($\sigma \in H$) such that
$U^\pi = R^\sigma$. This implies that the linear program (6) with $Q^\sigma = R^\sigma \setminus J^\sigma$ has an optimal solution
and also that the dual linear program (7) has an optimal solution $\lambda'$. Each $x \in S^\pi$ is an optimal
solution for (8) if $\lambda = \lambda'$, and, according to Theorem 1 in [3], it is an efficient solution for the
multiple-objective transportation problem (1). This completes the proof.

THEOREM 13: Let $x^0 = (x^0_{11}, \ldots, x^0_{mn})$ be an efficient solution for the multiple-
objective transportation problem (1). Then there exists a $\pi \in \{1, \ldots, \bar{\pi}\}$ such that $x^0 \in S^\pi$.

PROOF: Let $x^0$ be an efficient solution for (1). According to Theorem 1 in [3], there
exists a $\lambda^0$ such that $x^0$ is optimal for (8) if $\lambda = \lambda^0$. Let $H^0$ denote the index set of all
dual feasible efficient basic solutions $x^\sigma$ which are optimal for (8) when $\lambda = \lambda^0$. Then $x^0$ can
be represented in the form

$$x^0 = \sum_{\sigma \in H^0} \alpha_\sigma x^\sigma, \qquad \sum_{\sigma \in H^0} \alpha_\sigma = 1, \quad \alpha_\sigma \ge 0 \text{ for all } \sigma \in H^0.$$




Let $J^0 = \bigcup_{\sigma \in H^0} J^\sigma$. For some $\gamma \in H^0$ the linear program (7) for $Q^\gamma = Q^0 = J^0 \setminus J^\gamma$ has an
optimal solution, so that the linear program (6), which is the dual of (7), also has an optimal
solution. The systematic search for all maximal index sets $Q^\sigma$ implies, in connection with
Theorems 8 and 9, that in Phase II some maximal index set $Q^\gamma$ has been constructed such that
$Q^0 \subseteq Q^\gamma$. From $Q^\gamma \cup J^\gamma \subseteq U^\pi$ for some $\pi \in \{1, \ldots, \bar{\pi}\}$ the assertion follows.

The set of all efficient solutions for the multiple-objective transportation problem is then
given by

$$S^0 = \bigcup_{\pi=1}^{\bar{\pi}} S^\pi.$$

Provided that all efficient basic solutions for (1) are nondegenerate, $\bar{\pi}$ is the minimal number of
convex sets of efficient solutions $S^\pi$. However, in the case of degeneracy it may happen that
$S^{\pi'} \subseteq S^{\pi''}$ for some $\pi', \pi'' \in \{1, \ldots, \bar{\pi}\}$. In that case, the respective sets $S^{\pi'}$ are easily
identified. We shall now present the iteration steps of Phase III.

STEP 29: Construct a minimal number $\bar{\pi}$ of index sets $U^\pi$ from the complete list of
index sets $R^\sigma = Q^\sigma \cup J^\sigma$ ($\sigma \in H$). Put $\pi = 0$.

STEP 30: Put $\pi = \pi + 1$.

STEP 31: Construct the index set $I^\pi$ and the corresponding convex set of efficient
solutions $S^\pi$.

STEP 32: If $\pi = \bar{\pi}$, go to Step 33. Otherwise go to Step 30.

STEP 33: Print the sets $S^\pi$ for all $\pi = 1, \ldots, \bar{\pi}$.
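
Steps 29 to 31 amount to keeping the inclusion-maximal index sets and then collecting the bases contained in each of them. The sketch below is a minimal illustration under stated assumptions: `maximal_sets` and `members` are hypothetical helper names, the basic index sets are those of the example, and the list `R` contains only those sets $R^\sigma$ that the example states explicitly, not the complete list produced by Phase II.

```python
def maximal_sets(sets):
    """Keep each distinct set that is not a proper subset of another one."""
    frozen = [frozenset(s) for s in sets]
    result = []
    for s in frozen:
        if s in result:                  # same index set found again
            continue
        if any(s < t for t in frozen):   # proper subset of some other set
            continue
        result.append(s)
    return result

def members(U, J):
    """Index set I: all sigma whose basic index set J[sigma] lies in U."""
    return sorted(sigma for sigma, basis in J.items() if set(basis) <= U)

J = {
    1: {(1, 3), (2, 2), (2, 3), (3, 1), (3, 3)},
    2: {(1, 3), (2, 2), (2, 3), (3, 1), (3, 2)},
    3: {(1, 2), (1, 3), (2, 3), (3, 1), (3, 2)},
    4: {(1, 3), (2, 1), (2, 2), (2, 3), (3, 2)},
    5: {(1, 2), (1, 3), (2, 1), (2, 3), (3, 2)},
    6: {(1, 1), (1, 3), (2, 2), (2, 3), (3, 2)},
    7: {(1, 1), (1, 2), (1, 3), (2, 3), (3, 2)},
}

R = [
    J[1] | {(3, 2)},                     # R^1 = Q^1 u J^1
    J[2] | {(1, 2), (2, 1)},             # first R^2
    J[2] | {(3, 3)},                     # second R^2, identical with R^1
    {(1, 1), (1, 2), (1, 3), (2, 1), (2, 2), (2, 3), (3, 2)},  # set equal to U^3
]

U = maximal_sets(R)
I = [members(u, J) for u in U]
print(len(U), I)
```

Each resulting index list then defines one convex set as the convex hull of the corresponding basic solutions.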

To illustrate Phase III of the algorithm we shall now determine the sets $S^\pi$ for our
example. Recall that, for $\sigma = 1$, $\{(3,2)\}$ is the only maximal index set $Q^\sigma$. We construct the index
set $R^1 = \{(1,3), (2,2), (2,3), (3,1), (3,2), (3,3)\}$. For $\sigma = 2$, two maximal index sets $Q^\sigma$
have been determined: $\{(1,2), (2,1)\}$ and $\{(3,3)\}$. We obtain two index sets $R^2$, of which the
latter is identical with $R^1$: $\{(1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2)\}$ and $\{(1,3),
(2,2), (2,3), (3,1), (3,2), (3,3)\}$. From all sets $R^\sigma$ which have been generated from the
maximal index sets $Q^\sigma$ ($\sigma = 1, \ldots, 7$), $\bar{\pi} = 3$ index sets $U^\pi$ can be constructed:

$U^1 = \{(1,3), (2,2), (2,3), (3,1), (3,2), (3,3)\}$;

$U^2 = \{(1,2), (1,3), (2,1), (2,2), (2,3), (3,1), (3,2)\}$;

$U^3 = \{(1,1), (1,2), (1,3), (2,1), (2,2), (2,3), (3,2)\}$.

The respective index sets $I^\pi$ are: $I^1 = \{1,2\}$, $I^2 = \{2,3,4,5\}$, and $I^3 = \{4,5,6,7\}$. The set of all
efficient solutions is given by

$$S^0 = \bigcup_{\pi=1}^{3} S^\pi,$$

where




REFERENCES

[1] DANTZIG, G.B., Linear Programming and Extensions (Princeton University Press,
Princeton, New Jersey, 1963).

[2] GEOFFRION, A.M., "Solving Bicriterion Mathematical Programs," Operations Research 15,
39-54 (1967).

[3] ISERMANN, H., "Proper Efficiency and the Linear Vector Maximum Problem," Operations
Research 22, 189-191 (1974).

[4] ISERMANN, H., "On Some Relations Between a Dual Pair of Multiple Objective Linear
Programs," Zeitschrift für Operations Research 22, 33-41 (1978).

[5] LASDON, L.S., Optimization Theory for Large Systems (Macmillan, New York, 1970).

[6] MANGASARIAN, O.L., Nonlinear Programming (McGraw-Hill, New York, 1969).

[7] SIMONNARD, M., Linear Programming (Prentice-Hall, Englewood Cliffs, N.J., 1966).

[8] SRINIVASAN, V., and G.L. THOMPSON, "Determining Cost vs. Time Pareto-Optimal
Frontiers in Multi-Modal Transportation Problems," Transportation Science 11, 1-19
(1977).


COORDINATED REPLENISHMENTS OF ITEMS
UNDER TIME-VARYING DEMAND:
DYNAMIC PROGRAMMING FORMULATION*


Edward A. Silver†

Département de Mathématiques
École Polytechnique Fédérale de Lausanne
Lausanne, Switzerland

ABSTRACT 

We consider a group (or family) of items having deterministic, but time-
varying, demand patterns. The group is defined by a setup-cost structure that
makes coordination attractive (a major setup cost for each group replenishment
regardless of how many of the items are involved). The problem is to determine
the timing and sizes of the replenishments of all of the items so as to
satisfy the demand out to a given horizon in a cost-minimizing fashion. A
dynamic programming formulation is illustrated for the case of a two-item
family. It is demonstrated that the dynamic programming approach is
computationally reasonable, in an operational sense, only for small family sizes. For large
families heuristic solution methods appear necessary.

1.  INTRODUCTION 

In  this  paper  we  consider  a group  (or  family)  of  items  for  which  the  setup-cost  structure  is 
such  that  coordination  of  replenishments  of  the  individual  items  is  attractive.  In  addition,  the 
individual-item  demand  patterns,  although  assumed  deterministic,  can  vary  with  the  time 
period.  Contexts  in  which  coordination  is  of  interest  include  those  cases  in  which  a group  of 
items  share 

(i)  a common  supplier; 

(ii)  a common  mode  of  transportation  (e.g.,  a freight  car  or  an  ocean-going  container); 

(iii)  the  same  production  facility. 

Time-varying demand patterns permit us to treat a number of important situations
including

(i) the fabrication of components and subassemblies in a multistage-production context:
once the master schedules of finishing operations of the various end items are specified, the
requirements of components and subassemblies through time are essentially deterministic and

*The research leading to this paper was supported by the National Research Council of Canada Grant No. A7417.
†On sabbatical leave from the Department of Management Sciences, University of Waterloo, Waterloo, Ontario,
Canada.




E.  A.  SILVER 


time-varying  (even  for  level  end-item  usage,  the  requirements  for  a particular  component  will 
vary  with  time  because  of  a batching  phenomenon,  where  the  batches  of  different  end  items 
requiring  the  same  component  will  likely  be  produced  at  different  times); 

(ii)  production  to  a firm  contract  where  the  specified  delivery  quantities  vary  with  time; 

(iii)  items  with  seasonal  demand. 

The single-item, time-varying problem has received considerable attention in the literature
beginning with the fundamental dynamic programming treatment by Wagner and Whitin [23].
A number of authors (see, for example, Blackburn and Kunreuther [1], Crabill and Jaquette
[3], Crowston and Wagner [4], Eppen et al. [8], Florian and Klein [9], Jagannathan and Rao
[11], Kunreuther and Morton [12,13], Lambrecht [14], Love [15], Lundin and Morton [16],
Newsom [17], Sawaki [18], Swoveland [21], Wagner [22], Zabel [24] and Zangwill [25]) have
generalized the Wagner-Whitin model and associated results. In addition, Diegel [6] has
treated the special case where the demand rate changes linearly with time. Several heuristic
solutions have also been proposed for the basic Wagner-Whitin problem. Examples include De
Matteis and Mendoza [5] and Silver and Meal [20].

For the case of level demand patterns and the coordinated setup-cost structure to be
considered in the present paper, Goyal [10] has developed an iterative algorithm that finds the
optimal solution within a particular class of policies. Brown [2] and Doll and Whybark [7] have
also developed iterative solution procedures; however, Silver [19] has proposed a simpler,
heuristic approach.

In Section 2, the physical process is described and the underlying assumptions of the
model are laid out. Section 3 presents certain key properties of the optimal solution. For the
case of just two items in the family, these properties are exploited to develop a dynamic
programming formulation in Section 4. A numerical illustration is provided in Section 5. Section
6 is concerned with the general case of several items and the associated rapid growth in the
number of states that have to be considered as a function of the number of items in the family
and the number of time periods out to the chosen planning horizon. Finally, Section 7
provides some concluding remarks.


2.  THE  PHYSICAL  PROCESS  AND 
UNDERLYING  ASSUMPTIONS 

As  discussed  above,  we  consider  a group  of  items  where  savings  in  replenishment  costs 
are  possible  through  the  coordination  of  replenishments  of  two  or  more  items.  These  items  are 
faced  with  known,  but  likely  time-varying,  demand  patterns  out  to  some  specified  time  horizon. 

The supply of each item is depleted according to its demand pattern. When the inventory of
one or more items reaches a critically low level, a family replenishment, involving one or more
items, is initiated. The problem is to select the timing and makeup of these replenishments in
a cost-effective fashion.

The  model  that  we  shall  develop  is  based  on  a number  of  assumptions,  many  of  which 
could  be  relaxed  (e.g.,  concave  instead  of  linear  carrying  costs).  We  restrict  attention  to  this 
set  of  assumptions  because  the  problem  was  observed  in  this  particular  form  within  practical 


COORDINATED  REPLENISHMENTS  UNDER  TIME-VARYING  DEMAND 




contexts. In addition, although the dynamic programming approach could handle less-rigid
assumptions, our ultimate goal is to develop simple, heuristic procedures, in which case the
solution for the special situation considered (which is likely to be easier to achieve than the
solution for the more general situation) would still be of substantial value to practitioners facing
coordinated control under time-varying demand patterns.

The  specific  assumptions  include 

(1) We consider a family of n items where the family is defined according to the following
setup-cost structure. There is a major setup cost A, in dollars, associated with any
replenishment involving one or more of the items. There is a minor setup cost $a_i$, in dollars, incurred
whenever item i is included in a replenishment of the family.

(2) The demand for item i is deterministic but varying with time. The symbol $D_{i,m}$
represents the demand for item i in period m ($i = 1, 2, \ldots, n$; $m = 1, 2, \ldots, N$, where N is the
planning horizon). In the subsequent formulation, time will be measured in the conventional
forward fashion.

(3)  There  are  no  quantity  discounts  and  the  unit-variable  acquisition  cost  of  any  specific 
item  is  assumed  constant,  independent  of  the  period  of  acquisition. 

(4)  The  requirements  for  any  period  m must  be  on  hand  at  or  before  the  start  of  period  m 
— this  assumption  was  motivated  by  practical  conditions  in  which  the  requirements  are  often 
input  materials  for  further  production;  to  provide  short-range  scheduling  flexibility,  all  of  the 
requirements  for  a period  are  needed  at  the  start  of  the  period. 

(5)  The  initial  inventory  of  each  item  is  at  the  zero  level,  hence  a group  replenishment 
involving  all  items  must  take  place  at  the  start  of  period  1.  (The  extension  to  the  more  general 
case  of  nonzero  initial  inventories  is  straightforward). 

(6) There is a carrying charge of $h_i$ dollars/unit of item i carried from one period to the
next (i.e., carrying charges are based on the period-ending inventories).

(7)  We  do  not  consider  production  constraints  — this  assumption  is  an  obvious  weakness 
in  the  production  context.  However,  the  problem  is  certainly  of  importance  to  distributors 
where  production  constraints  are  nonexistent. 
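
Taken together, assumptions (1) to (7) determine the cost of any given replenishment plan. The sketch below evaluates that cost; it is an illustration only, with a hypothetical function name `plan_cost` and a small two-period data set chosen for the example. Unit acquisition costs are omitted because, under assumption (3), they are the same for every feasible plan.

```python
def plan_cost(Q, D, h, A, a):
    """Total setup plus carrying cost of a replenishment plan.

    Q[i][m] and D[i][m] are the order quantity and demand of item i in
    period m + 1; h[i] and a[i] are its carrying charge and minor setup
    cost; A is the major setup cost.  Initial inventories are zero.
    """
    n_items, n_periods = len(D), len(D[0])
    inventory = [0] * n_items
    total = 0
    for m in range(n_periods):
        ordered = [i for i in range(n_items) if Q[i][m] > 0]
        if ordered:                       # one major setup per group order
            total += A + sum(a[i] for i in ordered)
        for i in range(n_items):
            inventory[i] += Q[i][m] - D[i][m]
            if inventory[i] < 0:          # requirements must be on hand
                raise ValueError(f"item {i + 1} short in period {m + 1}")
            total += h[i] * inventory[i]  # charge on period-ending stock
    return total

# Hypothetical two-period instance: item 1 is ordered every period,
# item 2 is ordered once for both periods.
D = [[10, 70], [10, 10]]
Q = [[10, 70], [20, 0]]
cost = plan_cost(Q, D, h=[3, 1], A=40, a=[10, 20])
print(cost)
```

In this instance the first period carries the major cost and both minor costs, the second period carries the major cost and the minor cost of item 1, and ten units of item 2 are held for one period.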

As mentioned in the introduction, there has been considerable published material on
single-item extensions of the basic Wagner-Whitin model. In addition, Crabill and Jaquette [3]
treated  a two-item  problem  having  time-varying  demands.  However,  their  context  did  not 
include  a coordinated  setup-cost  structure,  but  rather  the  added  constraint  that  in  any  period  at 
most  one  item  can  be  replenished.  Crabill  and  Jaquette,  as  well  as  the  authors  of  all  the  other 
referenced  material  on  the  single-item  problem,  use  less-restrictive  assumptions  (except  for  the 
key  assumption  of  a coordinated  setup-cost  structure)  than  are  utilized  in  the  current  paper. 


3.  PROPERTIES  OF  THE  OPTIMAL  SOLUTION 

In  this  section  we  present  four  properties  that  the  optimal  policy  must  satisfy.  Proofs  of 
only  the  first  and  fourth  are  provided.  The  other  two  can  be  proved  by  methods  identical  to 
those  already  existing  in  the  literature  (see,  e.g.,  Crabill  and  Jaquette  [3]). 




We first introduce some additional notation: Let $I_{i,m}$ be the inventory of item i at the end
of period m before any delivery (production) at the start of period m + 1, and let $Q_{i,m}$ be the
amount of the replenishment of item i in (at the start of) period m.

PROPERTY 1: There exists an optimal solution $(I^*, Q^*)$ such that

$$Q^*_{i,m}\, I^*_{i,m-1} = 0 \quad \text{for } i = 1, 2, \ldots, n \text{ and } m = 1, 2, \ldots, N.$$

In words, we never replenish an item until its inventory drops to zero. This is a property of the
basic Wagner-Whitin model. If the starting inventory of an item is nonzero, then the property
may fail to hold, but only for the first replenishment.

PROOF: Consider any solution $(Q,I)$ such that $Q_{i,m}\, I_{i,m-1} > 0$ for at least one i,m pair.

Let $k = \min \{ m \mid Q_{i,m}\, I_{i,m-1} > 0 \}$.

In case there is more than one i with $Q_{i,k}\, I_{i,k-1} > 0$, let

$j = \min \{ i \mid Q_{i,k}\, I_{i,k-1} > 0 \}$

and t be the time of the previous replenishment of product j.

Consider an alternative policy A obtained from $(Q,I)$, as follows: The inventory of $(Q,I)$
at the end of period $k-1$ is subtracted from the replenishment quantity in period t and added to
the replenishment quantity at the start of period k. Mathematically, we have

$$
\begin{aligned}
Q^A_{j,t} &= Q_{j,t} - I_{j,k-1}, \\
Q^A_{j,k} &= Q_{j,k} + I_{j,k-1}, \\
I^A_{j,m} &= I_{j,m} - I_{j,k-1}, \quad m = t, t+1, \ldots, k-1, \\
Q^A_{i,m} &= Q_{i,m}, \quad \text{all other } i,m, \\
I^A_{i,m} &= I_{i,m}, \quad \text{all other } i,m.
\end{aligned}
$$

From assumptions (3) and (6) it follows that

$$\text{Costs}(Q,I) - \text{Costs}(Q^A,I^A) = I_{j,k-1}\,(k-t)\,h_j > 0,$$

i.e., policy A improves on the performance of the initial policy. Furthermore, policy A has
$Q^A_{i,m}\, I^A_{i,m-1} = 0$ for $m \le k$, $i \le j$.

This reasoning is repeated, if necessary, for $m = k$, $i > j$, and then, if necessary, for $m > k$,
all i. Thus we eventually end with a solution satisfying Property 1 that costs less than policy
$(Q,I)$.
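
The exchange used in the proof can be traced on a small single-item example. The sketch below is illustrative only: `holding_cost` and `shift_leftover` are hypothetical helper names, the data are invented for the example, and periods are numbered from 1 as in the text.

```python
def holding_cost(Q, D, h):
    """Carrying cost of one item, charged on period-ending inventories."""
    inventory, total = 0, 0
    for q, d in zip(Q, D):
        inventory += q - d
        if inventory < 0:
            raise ValueError("demand not met")
        total += h * inventory
    return total

def shift_leftover(Q, D, t, k):
    """Policy A: move the stock left at the end of period k - 1 from the
    order placed in period t to the order placed in period k."""
    leftover = sum(Q[:k - 1]) - sum(D[:k - 1])   # inventory at end of k - 1
    QA = list(Q)
    QA[t - 1] -= leftover
    QA[k - 1] += leftover
    return QA, leftover

# Hypothetical data: an order is placed in period 3 while 5 units remain.
D, Q, h = [10, 5, 70], [20, 0, 65], 3
QA, leftover = shift_leftover(Q, D, t=1, k=3)
saving = holding_cost(Q, D, h) - holding_cost(QA, D, h)
print(QA, leftover, saving)
```

The saving equals the leftover quantity times the number of periods it was carried times the carrying charge, in agreement with the cost difference stated in the proof; setup costs are unchanged because the same periods still contain orders.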

PROPERTY  2:  If  we  restrict  ourselves  to  policies  satisfying  Property  1,  the  order  quantity 
placed  for  item  i at  the  start  of  period  m must  be  one  of 




PROPERTY 3: Again, restriction to policies satisfying Property 1 (or 2) implies that $I_{i,m-1}$
must be one of $0$, $D_{i,m}$, $D_{i,m} + D_{i,m+1}$, $\ldots$, $\sum_{r=m}^{N} D_{i,r}$.

PROPERTY 4 (Planning Horizon Concept): Consider $t < m$. If we have

(1)  $(m-t)\, h_i\, D_{i,m} > A + a_i$,

then the requirements $D_{i,m}$ (and those for any later period for item i) would not be included in
an order placed at the start of period t (namely $Q_{i,t}$).

PROOF: $(m-t)\, h_i\, D_{i,m}$ represents the carrying costs associated with the requirements
$D_{i,m}$ if they are included in a replenishment at the start of period t. If inequality (1) holds, this
implies that it costs less to make a special group replenishment, involving* only item i, at the
start of period m. Thus, any solution that includes $D_{i,m}$ in the replenishment at the start of
period t although inequality (1) holds could be improved upon by removing $D_{i,m}$ from the
replenishment at the start of period t and instead placing it in a replenishment at the start of
period m.
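
Property 4 translates directly into an upper bound on how far ahead an order may cover. The sketch below is a minimal illustration with a hypothetical function name; the demand figures and cost parameters are those of the numerical illustration in Section 5.

```python
def latest_period_to_include(D_i, h_i, A, a_i, t):
    """Last period whose requirement may still be ordered at the start of
    period t without violating inequality (1).  Periods are numbered
    from 1; D_i[m - 1] is the demand of the item in period m."""
    last = t
    for m in range(t + 1, len(D_i) + 1):
        if (m - t) * h_i * D_i[m - 1] > A + a_i:
            break            # period m and all later periods are excluded
        last = m
    return last

D1 = [10, 70, 80, 5, 5, 70]      # item 1 demands, periods 1 to 6
D2 = [10, 10, 20, 40, 10, 10]    # item 2 demands, periods 1 to 6

print(latest_period_to_include(D1, 3, 40, 10, t=1))
print(latest_period_to_include(D1, 3, 40, 10, t=3))
print(latest_period_to_include(D2, 1, 40, 20, t=1))
```

For item 1 in period 1, carrying the 70 units of period 2 for one period would cost 210, which exceeds A + a_1 = 50, so the first order of item 1 covers period 1 only.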

4. DYNAMIC PROGRAMMING FORMULATION FOR
THE CASE OF TWO ITEMS†

Let $(j_1, j_2)$ represent the inventory state (before a replenishment) at the start of a
particular period, where $j_1$ is the number of periods of supply of item 1, and $j_2$ is the number of
periods of supply of item 2. Then by Properties 1 and 3 we know that we can restrict the states
from which we place an order to those included in three sets:

(i) ($j_1$ = positive integer, $j_2 = 0$);

(ii) ($j_1 = 0$, $j_2$ = positive integer);

(iii) ($j_1 = 0$, $j_2 = 0$).

Now let $f_n(j,0)$ be the minimum cost in periods $n, n+1, \ldots, N$ if we are at the start of
period n (prior to an order) with $j > 0$ periods of supply left for item 1 and 0 periods left for
item 2, $f_n(0,j)$ be the same but with items 1 and 2 interchanged, and $f_n(0,0)$ be the minimum
cost in periods $n, n+1, \ldots, N$ if we are at the start of period n and both items must be ordered.
When item 1 is ordered we shall let the decision variable be $T_1$, the replenishment quantity,
expressed as a number of periods of supply (we know from Property 2 that $T_1 = 1, 2, 3, \ldots$).
$T_2$ will be the similar quantity for item 2.

We shall illustrate the development of the recursion relationship for the case of a state
$(j,0)$ with $j > 0$. The state $(j,0)$ with $j > 0$ at the start of period n implies that only item 2
has to be replenished at the start of that period. Both the state variable j and the decision
variable $T_2$ can take on any of the values $1, 2, \ldots, N-n+1$. (Of course, for a specific numerical
example use of Property 4 could reduce the range of one or both of the variables.) The next
ordering state (after the replenishment at the start of period n) depends upon the relative
values of j and $T_2$. There are two possibilities:


*There may possibly be a tighter horizon property than Property 4, in that a replenishment of item i at the start of
period m need not incur the full major cost A (the requirements of another item may dictate a group replenishment at
that time).

†A numerical illustration of the solution procedure will be shown for a six-period problem in Section 5.




CASE 1: if $T_2 < j$, item 2 again runs out before* item 1 and the next ordering state is
$(j - T_2, 0)$ at the start of period $n + T_2$.


CASE 2: if $T_2 > j$, item 1 runs out before* item 2 and the next ordering state is
$(0, T_2 - j)$ at the start of period $n + j$. Thus, we have,† for $n = 2, 3, \ldots, N-1$,
$j = 1, 2, \ldots, N-n+1$,

(2)  $f_n(j,0) = A + a_2 + \min_{T_2} B(j,T_2)$,

where $A + a_2$ represents the setup costs of a replenishment involving only item 2, and $B(j,T_2)$
represents the carrying cost from the start of period n to the time of the next replenishment
plus the functional value at that future time.


$$\cdots + h_2\, j \sum_{i=j+1}^{T_2} D_{2,n+i-1} + f_{n+j}(0, T_2 - j).$$

In expression (3) the first summation term expresses the carrying costs (from the start of
period n) of the requirements of item 1 for periods $n+1$ to $n+T_2-1$ inclusive, the second
summation term‡ represents the cost of carrying the requirements of periods $n+T_2$,
$n+T_2+1$, $\ldots$, $n+j-1$ for item 1 through the $T_2$ periods until the time of the next replenishment
(at the start of period $n+T_2$), and the third summation term gives the carrying costs (from the
start of period n) for the requirements of item 2 for periods $n+1$ to $n+T_2-1$, inclusive. Of
course, the last term represents the functional value of being in the next ordering state,
$(j-T_2, 0)$, at the time of the next order, the start of period $n+T_2$. The terms in expression (4)
can be similarly interpreted when one realizes (for $T_2 > j$) that item 1 will now run out at the
start of period $n+j$ when there will still be $T_2-j$ periods of supply of item 2 remaining.
Recursion relationships, closely paralleling those of equations (2), (3), and (4), can be developed for
the other two types of ordering states.
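
The recursion can also be stated period by period, which is convenient for a short program. The sketch below is a restatement under stated assumptions, not the expressions (2) to (4) themselves: the state records, for each item, the last period already covered by stock; an item is ordered only when it has run out (Property 1); and carrying charges are applied to period-ending inventories as in assumption (6). The function name `two_item_min_cost` is hypothetical.

```python
from functools import lru_cache

def two_item_min_cost(D1, D2, h1, h2, A, a1, a2):
    """Minimum setup plus carrying cost for two items over periods 1..N."""
    D, h, a = (D1, D2), (h1, h2), (a1, a2)
    N = len(D1)

    def carry(i, n, c):
        # Demands of periods n+1..c are still in stock at the end of period n.
        return h[i] * sum(D[i][n:c])

    @lru_cache(maxsize=None)
    def f(n, c1, c2):
        # c1, c2: last period covered by stock of item 1 and item 2;
        # a value of n - 1 means the item has run out at the start of n.
        if n > N:
            return 0
        out = [i for i, c in enumerate((c1, c2)) if c < n]
        setup = A + sum(a[i] for i in out) if out else 0
        options1 = range(n, N + 1) if 0 in out else (c1,)
        options2 = range(n, N + 1) if 1 in out else (c2,)
        return min(
            setup + carry(0, n, k1) + carry(1, n, k2) + f(n + 1, k1, k2)
            for k1 in options1
            for k2 in options2
        )

    return f(1, 0, 0)   # zero initial inventories: both items ordered in period 1

# Six-period data of the numerical illustration in Section 5.
best = two_item_min_cost([10, 70, 80, 5, 5, 70], [10, 10, 20, 40, 10, 10],
                         h1=3, h2=1, A=40, a1=10, a2=20)
print(best)
```

This sketch enumerates every coverage choice and does not use Property 4 to prune the decision ranges; for a one-period horizon it returns the sum of the major and both minor setup costs.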


Finally, the boundary conditions are

$f_N(0,0) = A + a_1 + a_2$,

$f_N(1,0) = A + a_2$,


*If $T_2 = j$, replace "again runs out before" or "runs out before" by "runs out at the same time as".

†For $n = 1$ the only state that need be considered is the given initial state, namely (0,0). In addition, a set of boundary
conditions will be specified for the last period ($n = N$).

‡The summation is defined to be zero when $T_2 = j$.




5. NUMERICAL ILLUSTRATION WITH
TWO ITEMS (n = 2)

We consider a six-period horizon ($N = 6$) with the two demand patterns as shown in Table 1.

Table 1

Period, m    $D_{1,m}$    $D_{2,m}$
    1           10           10
    2           70           10
    3           80           20
    4            5           40
    5            5           10
    6           70           10

Other relevant characteristics are $A = \$40$, $a_1 = \$10$, $a_2 = \$20$; $h_1 = \$3$/unit/period; and
$h_2 = \$1$/unit/period.

The various possible states (taking advantage of Property 4), their associated functional values, and the optimal values of the decision variables are shown in Table 2. From the table we trace back the solution as follows: The initial state is (0,0) at $n = 1$. From the last line of the table we see that the initial $T$'s are $T_1 = 1$ and $T_2 = 2$. This leads to the next ordering state being (0,1) at $n = 2$. Now, the table gives $T_1 = 1$. Hence, the next ordering state is (0,0) at $n = 3$. The appropriate row of the table indicates that $T_1 = 1$, $T_2 = 1$. Continuing in this fashion we develop the entire solution shown in Table 3.
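The total cost of the numerical example can be cross-checked independently of the dynamic program. The following is a minimal sketch, not the author's algorithm: it assumes that each replenishment exactly covers the demand up to the next replenishment of the same item, that carrying costs are charged on end-of-period inventory, and it simply enumerates every pair of order schedules. All function and variable names are illustrative.

```python
from itertools import combinations

# Data of the numerical example (Table 1 and the cost parameters in the text).
DEMAND = {1: [10, 70, 80, 5, 5, 70], 2: [10, 10, 20, 40, 10, 10]}
MAJOR_SETUP = 40               # A
MINOR_SETUP = {1: 10, 2: 20}   # a_1, a_2
HOLDING = {1: 3, 2: 1}         # h_1, h_2 per unit per period
N_PERIODS = 6


def order_schedules(n_periods):
    """Yield every tuple of 0-based order periods that starts with period 0."""
    for size in range(n_periods):
        for later in combinations(range(1, n_periods), size):
            yield (0,) + later


def holding_cost(demand, orders, rate):
    """Carrying cost when each order covers demand up to the next order."""
    cost = 0
    for k, start in enumerate(orders):
        end = orders[k + 1] if k + 1 < len(orders) else len(demand)
        for m in range(start + 1, end):
            # Demand of period m sits in stock for (m - start) period ends.
            cost += rate * demand[m] * (m - start)
    return cost


def total_cost(orders_1, orders_2):
    """Setup plus carrying cost of a pair of order schedules."""
    cost = holding_cost(DEMAND[1], orders_1, HOLDING[1])
    cost += holding_cost(DEMAND[2], orders_2, HOLDING[2])
    for m in range(N_PERIODS):
        in_1, in_2 = m in orders_1, m in orders_2
        if in_1 or in_2:
            cost += MAJOR_SETUP + in_1 * MINOR_SETUP[1] + in_2 * MINOR_SETUP[2]
    return cost


def best_cost():
    """Minimum total cost over all 32 x 32 schedule pairs."""
    return min(total_cost(s1, s2)
               for s1 in order_schedules(N_PERIODS)
               for s2 in order_schedules(N_PERIODS))
```

Calling `total_cost((0, 1, 2, 3, 5), (0, 2, 3))` evaluates the schedule traced back in the text (item 1 ordered in periods 1, 2, 3, 4, and 6; item 2 in periods 1, 3, and 4), and `best_cost()` gives the enumerated minimum for comparison. Enumeration is only practical here because the example is tiny.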

6.  THE  GENERAL  CASE  OF  n ITEMS 

We saw in the previous section that for the case of two items we had to consider the maximum number of states (i.e., ignoring any potential benefits of Property 4) shown in Table 4. Thus, the total number of states is given by

$$\sum_{k=1}^{N-1} (2k + 1) + 1,$$

which simplifies to $N^2$.

Now, when we go to $n > 2$ items, conceptually the approach does not change. However, the number of possible states proliferates. When there are $k$ periods remaining, the possible ordering states include: only 1 item is ordered (there are $n$ choices of the item and each of the other items can be in any one of $k$ possible states), 2 items are ordered (there are* ${}_nC_2$ choices of the 2 items and each of the remaining items can be in any one of $k$ possible states), and others.

If we let $F(n,N)$ represent the total number of states (ignoring the effects of Property 4), then a continuation of the above argument yields


*${}_nC_2 = \dfrac{n!}{2!\,(n-2)!}$




E.  A.  SILVER 


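$$
\begin{aligned}
F(n,N) &= 1 + \sum_{j=1}^{N-1} \left( n j^{n-1} + {}_nC_2\, j^{n-2} + \dots + n j + 1 \right) \\
&= 1 + \sum_{j=1}^{N-1} \left[ \left( {}_nC_0\, j^{n} + {}_nC_1\, j^{n-1} + {}_nC_2\, j^{n-2} + \dots + {}_nC_n\, j^{0} \right) - {}_nC_0\, j^{n} \right]
\quad \text{(adding and subtracting } {}_nC_0\, j^{n}\text{)} \\
&= 1 + \sum_{j=1}^{N-1} \left[ (1+j)^{n} - j^{n} \right] \\
&= 1 + [2^{n} - 1] + [3^{n} - 2^{n}] + \dots + [N^{n} - (N-1)^{n}] \\
&= N^{n}.
\end{aligned}
$$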


Thus,  dynamic  programming  is  only  feasible  for  fairly  small  values  of  n and  N.  Perhaps  the 
equivalent  formulation  as  a shortest-route  problem  might  extend  the  range  of  feasible  computa- 
tion somewhat  in  that,  as  discussed  by  Crabill  and  Jaquette  [3],  a properly  encoded  shortest- 
route  formulation  is  computationally  more  efficient  than  dynamic  programming. 
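The telescoping argument above is easy to confirm numerically. The sketch below (the function name is illustrative) counts states exactly as the combinatorial argument describes: with $k$ periods remaining, $m$ of the $n$ items are ordered and each of the other $n - m$ items is in one of $k$ states.

```python
from math import comb


def count_states(n_items, n_periods):
    """Count ordering states term by term, ignoring Property 4."""
    total = 1  # the single prescribed initial state
    for k in range(1, n_periods):            # k periods remaining
        for m in range(1, n_items + 1):      # m items ordered together
            total += comb(n_items, m) * k ** (n_items - m)
    return total
```

For instance, `count_states(2, 6)` reproduces the two-item count of the previous section, and comparing `count_states(n, N)` with `N ** n` over a grid of small values shows how quickly the state space grows with the number of items.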


Table 2. Details of the Solution of the Numerical Example

                                  Best T values for      Functional
      State for    State for      given n and state      value
 n    item 1, J_1  item 2, J_2     T_1        T_2        f_n
 6        1            0            -          1           60
          0            1            1          -           50
          0            0            1          1           70
 5        1            0            -          2          120
          0            2            1          -          110
          0            1            1          -          120
          0            0            1          2          130
 4        2            0            -          2,3        155
          1            0            -          1,2,3      190
          0            3            2          -          145
          0            2            2          -          145
          0            1            1          -          180
          0            0            2          2,3        165
 3        3            0            -          3          235
          2            0            -          2          245
          1            0            -          1          225
          0            4            3          -          235
          0            3            3          -          225
          0            2            2          -          235
          0            1            1          -          215
          0            0            1          1          235
 2        1            0            -          1,2        295
          0            2            1          -          285
          0            1            1          -          285
          0            0            1          1,2        305
 1*       0            0            1          2          365

*In the first period there is only a single state that need be considered because of the prescribed initial conditions.




Table 3. Summary of the Solution to the Numerical Example
($A = \$40$, $a_1 = \$10$, $a_2 = \$20$, $h_1 = \$3$, $h_2 = \$1$)

 Period            Item 1                           Item 2                   Setup
   m     T_{1,m} Q_{1,m} D_{1,m} I_{1,m}   T_{2,m} Q_{2,m} D_{2,m} I_{2,m}   cost
   1        1      10      10      0          2      20      10      10      $70
   2        1      70      70      0          -       -      10       0       50
   3        1      80      80      0          1      20      20       0       70
   4        2      10       5      5          3      60*     40      20       70
   5        -       -       5      0          -       -      10      10        -
   6        1      70      70      0          -       0*     10       0       50
 Total                              5                                 40     $310

Total costs = 310 + 3(5) + 1(40) = $365.

*There is an alternative optimal solution with $Q_{2,4} = 50$ instead of 60 and $Q_{2,6} = 10$ instead of 0.


Table 4

 Number of periods                                            Total number
 remaining, k       Types of states                           of states
      1             (1,0), (0,1), (0,0)                           3
      2             (2,0), (1,0), (0,2), (0,1), (0,0)             5
    k < N           (j > 0, 0), (0, j > 0), (0,0)                 2k + 1
      N             initial state (0,0)                           1

7.  CONCLUSIONS 

In  this  paper  we  have  addressed  a difficult  inventory-control  problem,  namely  that  of  a 
group  of  items  having  time-varying  demand  patterns  and  being  linked  by  a coordinated  setup- 
cost  structure.  As  discussed  in  the  introduction,  there  are  a number  of  important  practical  con- 
texts that  can  be  represented  in  such  a problem  format.  A dynamic  programming  solution 
method,  suitable  for  small  groups  of  items,  has  been  developed  and  illustrated.  This  solution 
approach  is  thus  directly  applicable  to  small-scale  versions  of  the  aforementioned  realistic  prob- 
lems. The  demonstrated  proliferation  of  the  state  space  for  larger  groups  of  items  indicates  that 
heuristic  procedures  are  needed  (a  topic  of  ongoing  research).  Nevertheless,  for  the  testing  of 
such  heuristics,  it  may  be  reasonable  to  obtain  optimal  solutions  by  dynamic  programming  for  a 
few  test  examples  with  somewhat  larger  values  of  n. 

Important  extensions  of  the  basic  model  would  include  the  introduction  of  capacity  res- 
trictions and/or quantity-discount possibilities. Either of these extensions dramatically complicates the class of solutions that must be considered because Property 1 need no longer be
satisfied  by  the  optimal  solution. 




ACKNOWLEDGMENT 

The  author  would  like  to  acknowledge  several  suggestions  made  by  an  anonymous  referee 
that  have  made  the  presentation  more  lucid. 


REFERENCES 

[1] Blackburn, J., and H. Kunreuther, "Planning and Forecast Horizons for the Dynamic Lot Size Model with Backlogging," Management Science 21, 251-255 (1974).

[2] Brown, R.G., Decision Rules for Inventory Management (Holt, Rinehart and Winston, New York, 1967), Chap. 5.

[3] Crabill, T., and D. Jaquette, "A Two Product Dynamic Economic Lot Size Production Model with Either-Or-Production Constraints," Naval Research Logistics Quarterly 21, 505-513 (1974).

[4] Crowston, W.B., and M.B. Wagner, "Dynamic Lot-Size Models for Multi-Stage Assembly Systems," Management Science 20, 14-21 (1973).

[5] De Matteis, J.J., and A.G. Mendoza, "An Economic Lot-Sizing Technique," IBM Systems Journal 7, 30-46 (1968).

[6] Diegel, A., "A Linear Approach to the Dynamic Inventory Problem," Management Science 12, 530-540 (1966).

[7] Doll, C.L., and D.C. Whybark, "An Iterative Procedure for the Single-Machine, Multi-Product, Lot Scheduling Problem," Management Science 20, 50-55 (1973).

[8] Eppen, G.D., F.J. Gould, and B.P. Pashigian, "Extensions of the Planning Horizon Theorem in the Dynamic Lot Size Model," Management Science 15, 268-277 (1969).

[9] Florian, M., and M. Klein, "Deterministic Production Planning with Concave Costs and Capacity Constraints," Management Science 18, 12-20 (1971).

[10] Goyal, S.K., "Determination of Optimum Packaging Frequency of Items Jointly Replenished," Management Science 21, 436-443 (1974).

[11] Jagannathan, R., and R. Rao, "A Class of Deterministic Production Planning Models," Management Science 19, 1295-1300 (1973).

[12] Kunreuther, H., and T. Morton, "Planning Horizons for Production Smoothing with Deterministic Demands: I," Management Science 20, 110-125 (1973).

[13] Kunreuther, H., and T. Morton, "Planning Horizons for Production Smoothing with Deterministic Demands: II," Management Science 20, 1037-1046 (1974).

[14] Lambrecht, M., "Capacity Constrained Multi-Facility Dynamic Lot-Size Problem," Nummer 19, Faculteit der Economische en Toegepaste Economische Wetenschappen, Katholieke Universiteit Leuven, Leuven, Belgium (1976).

[15] Love, S.F., "Bounded Production and Inventory Models with Piecewise Concave Costs," Management Science 20, 313-318 (1973).

[16] Lundin, R., and T. Morton, "Planning Horizons for the Dynamic Lot Size Model: Zabel vs Protective Procedures and Computational Results," Operations Research 23, 711-734 (1975).

[17] Newsom, P., "Multi-Item Lot-Size Scheduling by Heuristic, Part 1: With Fixed Resources," Management Science 21, 1186-1193 (1975).

[18] Sawaki, K., "Some Extensions of Dynamic Economic Lot-Size Model with Backlogging," Journal of the Operations Research Society of Japan 14, 1-18 (1971).

[19] Silver, E.A., "A Simple Method of Determining Order Quantities in Joint Replenishments under Deterministic Demand," Management Science 22, 1351-1361 (1976).

[20] Silver, E.A., and H.C. Meal, "A Heuristic for Selecting Lot Size Requirements for the Case of a Deterministic Time-Varying Demand Rate and Discrete Opportunities for Replenishment," Production and Inventory Management 14, 64-74 (1973).



[21] Swoveland, C., "A Deterministic Multi-Period Production Planning Model with Piecewise Concave Production and Holding-Backorder Costs," Management Science 21, 1007-1013 (1975).

[22] Wagner, H., "A Postscript to 'Dynamic Problems in the Theory of the Firm'," Naval Research Logistics Quarterly 7, 7-12 (1960).

[23] Wagner, H., and T.M. Whitin, "Dynamic Version of the Economic Lot Size Model," Management Science 5, 89-96 (1958).

[24] Zabel, E., "Some Generalizations of an Inventory Planning Horizon Theorem," Management Science 10, 465-471 (1964).

[25] Zangwill, W.I., "A Backlogging Model and a Multi-Echelon Model of a Dynamic Economic Lot Size Production System - A Network Approach," Management Science 15, 506-527 (1969).


A NOTE  ON  DUALITY  IN  HOMOGENEOUS 
FRACTIONAL  PROGRAMMING 


B.  D.  Craven 

Mathematics  Department 
University  of  Melbourne 
Parkville,  Victoria,  Australia 

B.  Mond 

Mathematics  Department 
La  Trobe  University 
Bundoora, Victoria, Australia


ABSTRACT 

For a linear fractional programming problem, Sharma and Swarup have constructed a dual problem, also a linear fractional program, in which the objective functions of both primal and dual problems are the same. Craven and Mond have extended this result to a nonlinear fractional programming problem with linear constraints, and a dual problem for which the objective function is the same as that of the primal. This theorem is now further extended from linear to differentiable convex constraints.

Duality for fractional and nonlinear programming has been treated extensively in the recent literature; see Schaible [4], where many references are given. Sharma and Swarup [5] gave a formulation of linear fractional programming where the dual problem is a fractional program with the same objective function as the primal. Craven and Mond [1] extended this to a nonlinear fractional program with linear constraints. The result is now extended to differentiable convex constraints.

Consider  the  two  fractional  programming  problems: 

(P): Maximize $\{ f(x)/g(x) : h(x) \le 0,\ x \in X_0 \}$;

(D): Minimize $\{ f(u)/g(u) : v \ge 0,\ u \in X_0,\ v^T [h'(u)u - h(u)] \le 0,\ v^T h'(u) = g(u)f'(u) - f(u)g'(u) \}$.

Here $x, u \in X_0$, an open convex subset of $R^n$; $f: X_0 \to R$, $g: X_0 \to R$, $h: X_0 \to R^m$ are differentiable functions, with gradients denoted $f'(x)$, $g'(x)$, and $h'(x)$. ($f'(x)$ and $g'(x)$ are row vectors; $h'(x)$ is an $m \times n$ matrix.) Assume that $x \in X_0 \Rightarrow f(x) \ge 0$ and $g(x) \ge 0$, and that $g(x) = 0 \Rightarrow f(x) \ne 0$.

The functions $f$ and $g$ will be assumed homogeneous of the same degree $\lambda$; thus, for all $x \in X_0$ and all $t \in (0, \infty)$, $f(tx) = t^{\lambda} f(x)$, and likewise for $g$. From this follows Euler's relation: $f'(x)x = \lambda f(x)$. If $f$ is concave and nonnegative, and $g$ is convex and positive, then $f(x)/g(x)$ is pseudoconcave [2,3], and a local maximum of $f(x)/g(x)$ is a global maximum. The present proof proves the latter fact independently, and does not use pseudoconcavity.




DUALITY THEOREM: Let $f$ and $g$ be real differentiable functions, each homogeneous, of the same degree $\lambda$, with $f$ concave and $g$ convex; let $h$ be convex and differentiable. Then $f(x)/g(x) \le f(u)/g(u)$ whenever $x$ is feasible for (P) and $(u,v)$ is feasible for (D). Moreover, if also (P) attains a finite maximum at $x = a$, and if the Kuhn-Tucker constraint qualification holds there, then there exist $(u,v)$ feasible for (D), with $u = a$. (Thus (D) is a dual problem to (P).)

PROOF: Let $x$ be feasible for (P), and $(u,v)$ for (D). Since $h$ is convex, $h(u) + h'(u)(x - u) \le h(x)$; hence, using the constraints of (D),

$$0 \ge v^T [h'(u)u - h(u)] \ge v^T [h'(u)x - h(x)] \ge v^T h'(u)x = g(u)f'(u)x - f(u)g'(u)x = \phi'(u)x,$$

where $\phi(\cdot) = g(u)f(\cdot) - f(u)g(\cdot)$ is concave and homogeneous, since $u \in X_0 \Rightarrow g(u) \ge 0$ and $f(u) \ge 0$. Hence $\phi'(u)x \ge \phi(x) - \phi(u) + \phi'(u)u = \phi(x) + (\lambda - 1)\phi(u) = \phi(x) + 0$. Thus, $0 \ge g(u)f(x) - f(u)g(x)$. Hence, if $g(u) > 0$ and $g(x) > 0$, then

(1) $f(u)/g(u) \ge f(x)/g(x)$.

The latter also follows when $g(u) = 0$, for then $f(u) \ne 0$, so $f(u) > 0$, and $f(u)/g(u) = +\infty$. But if $g(u) \ne 0$ and $g(x) = 0$, then $g(u)f(x) \le 0$, but also $g(u) > 0$ and $f(x) > 0$, a contradiction.

Now let (P) attain a finite maximum at $x = a$, and let the Kuhn-Tucker constraint qualification hold there. Then the Kuhn-Tucker theorem applied to (P), or Schaible [4, Prop. 4], shows that the last constraint of (D) is satisfied for $u = a$ and some Lagrange multiplier $v = \bar v \ge 0$; and also that $\bar v^T h(a) = 0$. Then also

$$\bar v^T [h'(a)a - h(a)] = \bar v^T h'(a)a = g(a)f'(a)a - f(a)g'(a)a = g(a)\lambda f(a) - f(a)\lambda g(a) = 0,$$

if we use the last constraint of (D), and homogeneity. Hence, $(a, \bar v)$ is feasible for (D), and the objective functions of (P) and (D) are equal there.

REMARKS: If $g(x)$ is a linear function $d^T x$, then $\phi$ is concave without the requirement that $u \in X_0 \Rightarrow f(u) \ge 0$. Weak duality then follows, as in [5], assuming however that $f(x)/g(x) < \infty$ whenever $x$ is feasible for (P). If $g(x) > 0$ and $g(u) > 0$, the proof is as before. If $g(u) = 0$ and $g(x) > 0$, then $0 \ge -f(u)g(x)$ gives $f(u) \ge 0$, and $f(u)/g(u) = +\infty$; if $g(u) > 0$ and $g(x) = 0$, then $g(u)f(x) \le 0$ gives $f(x) \le 0$, and $f(x)/g(x) = -\infty$. If $g(u) = g(x) = 0$, then $f(x) \ne 0$ by assumption; since $f(x)/g(x) \ne +\infty$ by assumption, $f(x)/g(x) = -\infty$. So (1) follows in each case.

If $f$, $g$, and $h$ are affine functions, then the constraints of (D) are also affine functions. Sharma and Swarup [5] consider $f(x) = c^T x$ and $g(x) = d^T x$ and prove a duality theorem, assuming that $d \ge 0$. They state an extension where, instead, $d^T x \ge 0$ for each $x$ feasible for (P); this corresponds here to $x \in X_0 \Rightarrow d^T x \ge 0$. However, the dual in [5] omits the dual constraint $d^T u \ge 0$. Example 2 in [5] gives the following two problems:

(P1): Maximize $\left\{ \dfrac{2x_1 + 3x_2}{x_1 - x_2} : x_1 \ge 0,\ x_2 \ge 0,\ 2x_1 + 3x_2 \le 6,\ x_1 - x_2 \ge 1 \right\}$;

(D1): Minimize $\left\{ \dfrac{2u_1 + 3u_2}{u_1 - u_2} : u_1, u_2, v_1, v_2 \ge 0,\ 2v_1 - v_2 + 5u_2 \ge 0,\ 3v_1 + v_2 - 5u_1 \ge 0,\ -6v_1 + v_2 \ge 0 \right\}$.

If to (D1) is adjoined the constraint $u_1 - u_2 \ge 0$, then (D1) becomes indeed a dual to (P1). However, (P1) attains its optimal value of 6 at $x_1 = 9/5$, $x_2 = 4/5$; whereas (D1), without the extra constraint, has a feasible solution $v_1 = 0$, $v_2 = 5$, $u_1 = 0$, $u_2 = 1$, for which the objective function has the value $-3 < 6$, contradicting duality.
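The counterexample can be checked with exact rational arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the sign restrictions on $u$ and $v$ in `dual_feasible` follow the reading of (D1) adopted here, which is an assumption about the garbled original.

```python
from fractions import Fraction as Fr


def objective(z1, z2):
    """Common objective (2*z1 + 3*z2) / (z1 - z2) of (P1) and (D1)."""
    return (2 * z1 + 3 * z2) / (z1 - z2)


def primal_feasible(x1, x2):
    """Constraints of (P1)."""
    return x1 >= 0 and x2 >= 0 and 2 * x1 + 3 * x2 <= 6 and x1 - x2 >= 1


def dual_feasible(u1, u2, v1, v2, with_extra_constraint=False):
    """Constraints of (D1); optionally adjoin u1 - u2 >= 0."""
    ok = (min(u1, u2, v1, v2) >= 0
          and 2 * v1 - v2 + 5 * u2 >= 0
          and 3 * v1 + v2 - 5 * u1 >= 0
          and -6 * v1 + v2 >= 0)
    if with_extra_constraint:
        ok = ok and u1 - u2 >= 0
    return ok


# Points quoted in the text.
x_opt = (Fr(9, 5), Fr(4, 5))
dual_point = (Fr(0), Fr(1), Fr(0), Fr(5))   # (u1, u2, v1, v2)
```

Evaluating `objective(*x_opt)` and `objective(dual_point[0], dual_point[1])` gives the two values compared in the text, and calling `dual_feasible(*dual_point, with_extra_constraint=True)` shows that the adjoined constraint excludes the offending point.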

Examples of nonlinear functions satisfying the hypotheses of the theorem have been given in [1]. If $h(x) = Ax - b$, where $A \in R^{m \times n}$ and $b \in R^m$, then the result of [1] is recovered.

If the constraint $h(x) \le 0$ is generalized to $h(x) \in -S$, where $S$ is a closed convex cone, then the theorem remains valid, with $v \ge 0$ replaced by $v \in S^*$, the dual cone of $S$, if the convex cone

$$[h(a) \mid h'(a)]^T (S^*)$$

is assumed closed. The latter hypothesis is required for the Kuhn-Tucker theorem; it holds automatically if $S$ is polyhedral. This cone version remains valid if $R^n$ is replaced by a Banach space of infinite dimension.


REFERENCES 

[1] Craven, B.D., and B. Mond, "Duality for Homogeneous Fractional Programming," Cahiers du Centre d'Études de Recherche Opérationnelle 18 (4), 413-417 (1976).

[2] Mangasarian, O.L., "Nonlinear Fractional Programming," Journal of the Operations Research Society of Japan 12, 1-10 (1969).

[3] Mangasarian, O.L., Nonlinear Programming (McGraw-Hill, New York, 1969), Chapter 9.

[4] Schaible, S., "Fractional Programming I, Duality," Management Science 22, 858-867 (1976).

[5] Sharma, I.C., and K. Swarup, "On Duality in Linear Fractional Functionals Programming," Zeitschrift für Operations Research 16, 91-100 (1972).


A NOTE  ON  A MODIFIED  BLOCK  REPLACEMENT  POLICY  FOR 
UNITS  WITH  INCREASING  MARGINAL  RUNNING  COSTS 



Menachem  Berg  and  Benjamin  Epstein 

Haifa  University  and  Technion 
Israel Institute of Technology
Haifa,  Israel 

ABSTRACT 

The  model  for  a modified  block  replacement  policy  (MBRP)  is  extended  to 
include  running  costs.  An  illustrative  example  is  worked  out  for  the  case  when 
item  life  is  exponentially  distributed  and  marginal  running  cost  per  unit  time  in- 
creases linearly  with  the  age  of  the  item. 

INTRODUCTION  AND  OUTLINE  OF  A 
MBRP  WITH  RUNNING  COSTS 

In a previous paper [3] we provided the theory for a modified block replacement policy (MBRP). In this note we introduce the additional assumption that it costs more to run a unit the older it becomes (for recent replacement models involving this assumption see Refs. [1], [2], [4], [6], [7], and [8]). More precisely, we will postulate a function $a(x)$, the marginal running cost per unit time of an item having age $x$. It is assumed that $a(x)$ is an increasing function of $x$. As in [3] the objective function that we wish to minimize is the expected cost per unit time, in the long run, of using a MBRP.


The objective function in this note is given by

(1) $C(b,t) = \dfrac{c_1 E_x[M_x(t)] + E_x[A_x(t)] + c_2 \tilde f(0)}{t},$


where $A_x(t)$ is the expected running cost in a block interval of length $t$ if the item is of age $x$ at the beginning of the interval. All other symbols and expressions used in (1) have precisely the same meaning and description as in section 2 of [3].


The function $A_x(t)$ satisfies the modified renewal equation

(2) $A_x(t) = \displaystyle\int_0^t \bar F_x(u)\, a(x+u)\, du + \int_0^t A(t-u)\, dF_x(u),$

where

(3) $A(t) = \displaystyle\int_0^t \bar F(u)\, a(u)\, [1 + M(t-u)]\, du.$

Equation (3) is obtained from (2) by setting $x = 0$ and then applying Theorem 3C in section 3.3 of [3]. The expected running cost for a block interval of length $t$ is given by

(4) $E_x[A_x(t)] = \displaystyle\int_0^b \tilde f(x)\, A_x(t)\, dx + \tilde f(0)\, A(t).$

A method for computing $\tilde f(x)$ and $\tilde f(0)$, the stationary age distribution of a unit at the beginning of a block interval, was developed in [3] under a factorization assumption for $f(t+y)$ (see formulas 12, 24, and 25 of [3] for details).

It should be noted that a MBRP with running costs reduces to a BRP with running costs when $b = 0$. In this case (1) becomes

(1') $C(0,t) = \dfrac{c_1 M(t) + A(t) + c_2}{t}.$

Optimal BRP policies in the presence of running costs are treated in [8] for several special cases.


AN EXAMPLE MBRP WITH RUNNING COSTS

In [3] we illustrated the computation of $\tilde f(x)$ and $\tilde f(0)$ when item life follows a two-stage Erlang distribution. These results can be used to obtain $E_x[A_x(t)]$ and hence $C(b,t)$ for this particular distribution. The detailed calculations and the subsequent evaluation of $(b^*, t^*)$ are left to the reader.

In order to highlight the way in which the consideration of running costs may affect the choice of $(b^*, t^*)$ we shall use another example involving the exponential lifetime p.d.f. $f(x) = \lambda e^{-\lambda x}$, $x \ge 0$. In this case it is only the assumption of an increasing marginal running cost which justifies carrying out planned replacements. The particular running cost function which we shall use in the sequel is $a(x) = \rho x$, $x \ge 0$, with $\rho > 0$.

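For the exponential distribution, for which the factorization of $f(t+y)$ holds trivially, we readily obtain

(5) $\tilde f(x) = \lambda e^{-\lambda x},\ 0 < x \le b;\quad \tilde f(0) = e^{-\lambda b},$

a result which could have been obtained as a direct consequence of the memoryless property of the exponential distribution. Hence,

(6) $M_x(t) = M(t) = \lambda t,\ t \ge 0.$

Direct substitution in (3) yields

(7) $A(t) = \dfrac{\rho}{\lambda^2} \left[ -1 + \lambda t + e^{-\lambda t} \right].$

Inserting (7) in (2), we obtain

(8) $A_x(t) = A(t) + \dfrac{\rho x}{\lambda} \left( 1 - e^{-\lambda t} \right).$

If we substitute (7) and (8) in (4) we get

(9) $E_x[A_x(t)] = \dfrac{\rho}{\lambda^2} \left[ \lambda t - e^{-\lambda b}(1 + \lambda b)\left(1 - e^{-\lambda t}\right) \right].$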
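The closed forms for the exponential case can be checked against direct numerical integration. The following is a rough sketch under stated assumptions: exponential lifetimes with rate `lam`, running cost $a(x) = \rho x$, the survival function $e^{-\lambda u}$ inside the integral of (3), and the stationary age distribution of (5). The midpoint rule and all function names are illustrative choices, not part of the original note.

```python
import math


def running_cost_closed_form(t, lam, rho):
    """Equation (7): A(t) = (rho / lam^2) * (-1 + lam*t + exp(-lam*t))."""
    return rho / lam ** 2 * (lam * t - 1.0 + math.exp(-lam * t))


def running_cost_by_quadrature(t, lam, rho, steps=20000):
    """Midpoint-rule value of equation (3) for exponential lifetimes."""
    du = t / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * du
        survival = math.exp(-lam * u)        # probability of no failure by u
        renewals = lam * (t - u)             # M(t - u) = lam * (t - u)
        total += survival * rho * u * (1.0 + renewals) * du
    return total


def expected_cost_closed_form(t, b, lam, rho):
    """Equation (9)."""
    decay = math.exp(-lam * b) * (1.0 + lam * b)
    return rho / lam ** 2 * (lam * t - decay * (1.0 - math.exp(-lam * t)))


def expected_cost_by_quadrature(t, b, lam, rho, steps=20000):
    """Equation (4) with the age distribution (5) and A_x(t) from (8)."""
    a_new = running_cost_closed_form(t, lam, rho)
    total = math.exp(-lam * b) * a_new       # probability mass at age zero
    dx = b / steps
    for k in range(steps):
        x = (k + 0.5) * dx
        density = lam * math.exp(-lam * x)
        a_x = a_new + rho * x / lam * (1.0 - math.exp(-lam * t))
        total += density * a_x * dx
    return total
```

With arbitrary trial values such as `t = 2.0`, `b = 0.7`, `lam = 1.5`, `rho = 0.8`, each closed form and its quadrature counterpart should agree to within the integration error.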

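The substitution of (5), (6), and (9) in (1) yields

(10) $C(b,t) = \dfrac{1}{t} \left\{ c_1 \lambda t + \dfrac{\rho}{\lambda^2} \left[ \lambda t - e^{-\lambda b}(1 + \lambda b)\left(1 - e^{-\lambda t}\right) \right] + c_2 e^{-\lambda b} \right\}.$

Taking partial derivatives of $C(b,t)$ with respect to $b$ and $t$ and equating to zero, we obtain the pair of equations

(11) $b = c_2 \lambda / \left[ \rho \left(1 - e^{-\lambda t}\right) \right]$

and

(12) $\lambda^2 c_2 / \rho = (1 + \lambda b)\left[ 1 - e^{-\lambda t}(1 + \lambda t) \right].$

Substituting (11) into (12), we obtain the following equation in $t$:

(13) $\lambda^2 c_2 / \rho = \left(e^{\lambda t} - 1\right)\left(1 - e^{-\lambda t}(1 + \lambda t)\right) / \lambda t.$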

The right hand side of (13) is a monotonically increasing function of $t$ going from zero, when $t = 0$, to infinity, as $t \to \infty$. Hence (13) has a unique solution $t^*$. Inserting $t^*$ into (11) we get $b^*$, which is also unique. An analysis of (11) and (13) shows that $b^*$ and $t^*$ are each increasing functions of $c_2/\rho$. In particular, $b^* \to \infty$ and $t^* \to \infty$ when $c_2/\rho \to \infty$, and $b^* \to 0$ and $t^* \to 0$ when $c_2/\rho \to 0$.


Recalling that a BRP is a special case of a MBRP when $b = 0$, we set $b = 0$ in (12) and obtain

(14) $\lambda^2 c_2 / \rho = 1 - e^{-\lambda t}(1 + \lambda t)$

as the equation for the optimal block interval for the BRP.

The right hand side of (14) is a monotonically increasing function of $t$ going from zero, for $t = 0$, to one, for $t \to \infty$. Hence a unique finite solution $t_0$ exists if $\lambda^2 c_2 / \rho < 1$. Otherwise, $t_0 = \infty$ and the optimal BRP becomes a failure replacement policy (FRP). It is worthwhile to note that unlike the BRP, there is always a finite solution for a MBRP. It is evident that we can always do better with an optimal MBRP than with an optimal BRP. We have examined numerically the possible savings for the case of an exponential life distribution with $c_1 = c_2 = c = 0.1\ (0.1)\ 0.9$ and $\rho = 1$. It turns out that the greatest savings in using the best MBRP rather than the best BRP is attained at $c = 0.7$ and that this savings is almost 8%. We would, of course, expect even higher savings with increasing failure rate (IFR) lifetime distributions and with marginal running cost functions which increase faster than the linear function considered here.
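The comparison between the best MBRP and the best BRP can be reproduced approximately with a few lines of code. This is a sketch under assumptions, not the authors' program: it takes the failure rate as $\lambda = 1$ (the text does not state the value used), solves (13) and (14) by bisection, obtains $b^*$ from (11), and evaluates the cost rate from (10). All function names are hypothetical.

```python
import math


def bisect_increasing(func, target, lo=1e-9, hi=1.0, iters=200):
    """Solve func(t) = target for an increasing function by bisection."""
    while func(hi) < target:
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rhs_13(t, lam):
    """Right hand side of equation (13)."""
    x = lam * t
    return math.expm1(x) * (1.0 - math.exp(-x) * (1.0 + x)) / x


def rhs_14(t, lam):
    """Right hand side of equation (14)."""
    x = lam * t
    return 1.0 - math.exp(-x) * (1.0 + x)


def cost_rate(b, t, c1, c2, lam, rho):
    """Equation (10); b = 0 gives the BRP cost rate (1')."""
    decay = math.exp(-lam * b) * (1.0 + lam * b)
    running = rho / lam ** 2 * (lam * t - decay * (1.0 - math.exp(-lam * t)))
    return (c1 * lam * t + running + c2 * math.exp(-lam * b)) / t


def optimal_mbrp(c2, lam, rho):
    """Return (b*, t*) from equations (13) and (11)."""
    t_star = bisect_increasing(lambda t: rhs_13(t, lam), lam ** 2 * c2 / rho)
    b_star = c2 * lam / (rho * (1.0 - math.exp(-lam * t_star)))
    return b_star, t_star


def optimal_brp(c2, lam, rho):
    """Return t_0 from equation (14), or infinity (failure replacement)."""
    k = lam ** 2 * c2 / rho
    if k >= 1.0:
        return math.inf
    return bisect_increasing(lambda t: rhs_14(t, lam), k)


def savings_fraction(c, lam=1.0, rho=1.0):
    """Relative saving of the best MBRP over the best BRP for c1 = c2 = c."""
    b_star, t_star = optimal_mbrp(c, lam, rho)
    t_0 = optimal_brp(c, lam, rho)
    if math.isinf(t_0):
        brp = c * lam + rho / lam            # limit of (1') as t grows
    else:
        brp = cost_rate(0.0, t_0, c, c, lam, rho)
    return (brp - cost_rate(b_star, t_star, c, c, lam, rho)) / brp
```

Under the assumed $\lambda = 1$, `savings_fraction(0.7)` can be compared with the figure quoted in the text, and looping over `c = 0.1, 0.2, ..., 0.9` repeats the tabulation described there.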


REFERENCES 

[1] Berg, M., "Optimal Replacement Policies for Two Unit Machines with Increasing Running Costs - I," Stochastic Processes and Their Applications 4, 89-106 (1976).

[2] Berg, M., "Optimal Replacement Policies for Two Unit Machines with Increasing Running Costs - II," Stochastic Processes and Their Applications 5, 315-322 (1977).

[3] Berg, M., and Epstein, B., "A Modified Block Replacement Policy," Naval Research Logistics Quarterly 23, 15-24 (1976).

[4] Cléroux, R., and Hanscom, M., "Age Replacement with Adjustment and Depreciation Costs and Interest Charges," Technometrics 16, 235-239 (1974).

[5] Parzen, E., Stochastic Processes (Holden-Day, San Francisco, 1962).

[6] Ran, A., and Rosenlund, S.I., "Age Replacement with Discounting for a Continuous Maintenance Model," Technometrics 18, 459-465 (1976).

[7] Schaeffer, R.L., "Optimum Age Replacement Policies with an Increasing Cost Factor," Technometrics 13, 139-144 (1971).

[8] Tilquin, C., and Cléroux, R., "Block Replacement Policies with General Cost Structures," Technometrics 17, 291-298 (1975).




A NOTE  ON  OPTIMAL  INVENTORY  MANAGEMENT  UNDER  INFLATION 


Ram  B.  Misra 


Bell  Telephone  Laboratories 


ABSTRACT 

This  paper  develops  a discounted-cost  model  that  is  similar  to  the  classical 
economic  order  quantity  model  but  includes  inflation  rates  as  parameters  of  the 
inventory  system.  A numerical  problem  is  solved  to  illustrate  the  effects. 


Most of the literature in the field of inventory management has not included inflation as a parameter of the system. This has happened mostly because of the belief that inflation (which was quite low in the United States prior to the 1970's) would not influence the policy variables to any significant degree. In 1975, Misra [5] and Buzacott [2] developed economic-order-quantity (EOQ) models which incorporated inflationary effects into the model. The models assume a uniform inflation rate for all the costs and minimize the average annual cost to derive an expression for the EOQ. It was also shown that if the unit selling price is changed only at the beginning of each cycle (as practiced by many grocery stores, i.e., charge more if you pay more), the objective function should be maximization of profit instead of minimization of cost. In the situation in which the selling price is increased continuously at the inflation rate, minimizing cost also maximizes profit. In a recent paper, Bierman and Thomas [1] have proposed an inflation model for the EOQ which also considers the time value of money. They too have assumed a single inflation rate for all cost factors. The cost equation in the model does not lend itself to the derivation of an expression for the EOQ; therefore the authors have suggested the use of a search method. Misra and Wortham [6] encountered a similar cost equation and suggested an approximation to derive an expression for the EOQ. For various problems the approximate EOQ was found to be within 1% of the exact EOQ. The purpose of this paper is to present a model which considers the time value of money and different inflation rates for various costs associated with an inventory system.

THE  PROPOSED  GENERAL  EOQ  MODEL 

In the analysis of an inventory system, normally three types of costs are considered. These are replenishment cost, inventory carrying cost, and shortage cost. In the basic model shortages are not allowed, so only the first two costs are included in the analysis. Purchasing cost is not included, because it is constant. This is not so if we consider inflation, hence this cost will be included in the analysis. The most general and realistic model will be the one which considers a separate inflation rate for each of its cost components [3,4]. Writing a cost expression for such a model is straightforward, but its optimization is very difficult, and will require the use of search procedures [7]. However, one can put these costs into two categories: category 1 consists of all those costs which increase at the inflation rate that prevails in the company, and category 2 consists of those that increase at the inflation rate of the general economy or of the supplier company. These will be called the internal (company) and external inflation rates respectively. Their values can be arrived at by some form of averaging (simple or weighted) of the individual inflation rates of costs in each category.

In  general,  replenishment  cost  will  increase  at  the  internal  inflation  rate  and  the  unit  pur- 
chasing cost  at  the  external  inflation  rate.  The  cost  of  carrying  inventory  consists  of  the  oppor- 
tunity cost  and  the  real  out-of-pocket  costs  such  as  costs  of  insurance,  taxes,  and  costs  of 
storage.  The  amount  of  capital  tied  up  in  inventory  changes  with  the  unit  cost,  which  increases 
with  the  external  inflation  rate.  The  cost  of  storage  can  be  in  either  category  or  in  both, 
depending  on  whether  the  company  owns  the  storage  space,  or  rents  it,  or  both.  Van  Hees  and 
Monhemius [4, pp. 81-101] have given an excellent breakdown of the various costs, which can
be  used  as  a guide  in  categorizing  them  along  the  lines  suggested  here.  The  classification  would 
also  vary  depending  on  whether  the  goods  are  ordered  from  outside  or  are  manufactured  within 
the  company.  For  instance,  if  goods  are  manufactured  within  the  company,  the  unit  cost  is 
governed  by  both  the  internal  and  external  inflation  rates.  This  is  so  because  part  of  the  unit 
cost  (material  cost,  for  instance)  increases  with  the  external  rate  and  part  (setup  cost  + direct 
costs  incurred  in  production)  with  the  internal  rate.  Thus,  while  a clear-cut  categorization  of 
these  costs  is  generally  not  possible,  for  a given  inventory  system  it  can  easily  be  done.  In  the 
formulation  that  follows,  it  is  assumed  that  this  has  been  done  and  the  corresponding  costs 
determined.  Also,  it  will  be  assumed  that  the  costs  vary  with  instantaneous  inventory  level. 
One  can  include  additional  terms  if  some  costs  depend  on  the  maximum  inventory. 


Formulation

The  present  worth  of  the  total  cost  for  the  first  cycle  is 

(1)  $P_1 = Qc + A + c_1 \int_0^{Q/\lambda} (Q - \lambda t)\, e^{i_1 t} e^{-rt}\, dt + c_2 \int_0^{Q/\lambda} (Q - \lambda t)\, e^{i_2 t} e^{-rt}\, dt,$

where $Q$ = reorder quantity, $\lambda$ = demand per unit time, $A$ = ordering cost, $c$ = unit cost, $c_1$ = internal inventory cost per unit per unit time, $i_1$ = internal inflation rate, $r$ = discount rate or cost of capital, $c_2$ = external inventory cost per unit per unit time, and $i_2$ = external inflation rate.


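Equation (1) simplifies to

(2)  $P_1 = Qc + A + c_1\left[\frac{Q}{R_1} - \frac{\lambda}{R_1^2}\left(1 - e^{-R_1 Q/\lambda}\right)\right] + c_2\left[\frac{Q}{R_2} - \frac{\lambda}{R_2^2}\left(1 - e^{-R_2 Q/\lambda}\right)\right],$

where $R_1 = r - i_1$ and $R_2 = r - i_2$.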
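The simplification from equation (1) to equation (2) can be checked numerically. The sketch below is only an illustration: the function names are hypothetical, the order quantity of 1000 is an arbitrary test value, and the remaining parameters are borrowed from the example later in the paper. It evaluates the integrals of equation (1) by composite Simpson integration and compares the result with the closed form, assuming $R_1$ and $R_2$ are nonzero.

```python
import math


def first_cycle_cost_closed_form(Q, lam, A, c, c1, c2, r, i1, i2):
    """Present worth of the first cycle from the closed form, equation (2)."""
    T = Q / lam  # cycle length: time needed to deplete Q at demand rate lam

    def holding(ck, Rk):
        # ck * integral_0^T (Q - lam*t) * exp(-Rk*t) dt, with Rk = r - ik != 0
        return ck * (Q / Rk - lam / Rk**2 * (1.0 - math.exp(-Rk * T)))

    return Q * c + A + holding(c1, r - i1) + holding(c2, r - i2)


def first_cycle_cost_numeric(Q, lam, A, c, c1, c2, r, i1, i2, steps=2000):
    """Present worth of the first cycle from equation (1), by Simpson's rule."""
    if steps % 2:
        raise ValueError("Simpson's rule needs an even number of steps")
    T = Q / lam

    def integrand(t):
        level = Q - lam * t  # inventory on hand at time t
        return level * (c1 * math.exp((i1 - r) * t) + c2 * math.exp((i2 - r) * t))

    h = T / steps
    total = integrand(0.0) + integrand(T)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return Q * c + A + total * h / 3.0


params = dict(Q=1000.0, lam=10_000.0, A=40.0, c=4.0,
              c1=0.20, c2=0.16, r=0.20, i1=0.08, i2=0.14)
closed = first_cycle_cost_closed_form(**params)
numeric = first_cycle_cost_numeric(**params)
print(f"closed form: {closed:.6f}  numeric: {numeric:.6f}")
```

The two values should agree to well within a cent, which supports the term-by-term form of equation (2).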


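For convenience let us define $E_1$ and $E_2$ such that

$E_1 = A + \frac{c_1 Q}{R_1} - \frac{c_1 \lambda}{R_1^2}\left(1 - e^{-R_1 Q/\lambda}\right), \qquad E_2 = Qc + \frac{c_2 Q}{R_2} - \frac{c_2 \lambda}{R_2^2}\left(1 - e^{-R_2 Q/\lambda}\right),$

so that $P_1 = E_1 + E_2$. The cost diagram for $N$ cycles is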


A NOTE  ON  OPTIMAL  INVENTORY  MANAGEMENT 




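[Figure: cost diagram for $N$ cycles, showing the internal cost $E_1$ and the external cost $E_2$ incurred in each successive cycle, escalated at the internal and external inflation rates respectively.]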


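The present worth of the total cost for $N$ cycles is

(3)  $P_T = E_1\left(1 + e^{-R_1 Q/\lambda} + e^{-2R_1 Q/\lambda} + \dots\right) + E_2\left(1 + e^{-R_2 Q/\lambda} + e^{-2R_2 Q/\lambda} + \dots\right)$

$\quad = \left[\frac{A + c_1 Q/R_1}{1 - e^{-R_1 Q/\lambda}} - \frac{c_1 \lambda}{R_1^2}\right]\left(1 - e^{-N R_1 Q/\lambda}\right) + \left[\frac{Qc + c_2 Q/R_2}{1 - e^{-R_2 Q/\lambda}} - \frac{c_2 \lambda}{R_2^2}\right]\left(1 - e^{-N R_2 Q/\lambda}\right).$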

The total cost in equation (3) will converge if $R_1$ and $R_2$ are positive, i.e., the inflation rates are smaller than the discount rate, even for the infinite planning horizon, $N \to \infty$.
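The convergence condition can be illustrated with the geometric series itself. The following is a minimal sketch with hypothetical function names and an arbitrary per-cycle cost `E`. It sums the discounted cycle costs term by term, compares the sum with the geometric closed form, and shows that the infinite-horizon limit exists only for a positive net rate `R`.

```python
import math


def present_worth_sum(E, R, Q, lam, N):
    """Add E * exp(-k * R * Q / lam) over cycles k = 0 .. N-1, term by term."""
    return sum(E * math.exp(-k * R * Q / lam) for k in range(N))


def present_worth_closed(E, R, Q, lam, N):
    """Geometric-series closed form of the same sum (requires R != 0)."""
    x = R * Q / lam
    return E * (1.0 - math.exp(-N * x)) / (1.0 - math.exp(-x))


def present_worth_infinite(E, R, Q, lam):
    """Limit of the sum as N grows without bound; finite only when R > 0."""
    if R <= 0:
        raise ValueError("series diverges unless inflation is below the discount rate")
    return E / (1.0 - math.exp(-R * Q / lam))


E, Q, lam = 100.0, 1000.0, 10_000.0  # illustrative values only

# R > 0: partial sums settle toward the infinite-horizon value.
for N in (10, 100, 1000):
    print(N, round(present_worth_closed(E, 0.12, Q, lam, N), 2))
print("limit", round(present_worth_infinite(E, 0.12, Q, lam), 2))

# R < 0 (inflation above the discount rate): partial sums keep growing.
for N in (10, 100, 1000):
    print(N, round(present_worth_closed(E, -0.05, Q, lam, N), 2))
```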

If the inflation rates are higher than the discount rate, the total-cost equation (3) is unbounded as $N \to \infty$. Thus, a finite horizon must be used for optimization, which we accomplish by differentiating equation (3) with respect to $Q$, equating it to zero, and solving for $Q$. This yields a complicated expression which cannot be solved for $Q$ directly and requires the use of search techniques. In this situation, if the costs $c_1$ and $c_2$ are zero, it is optimal to have $Q$ as large as possible. This is not a stable situation. To have finite $Q$, either the inflation rates should be less than the discount rate, or $c_1$ and $c_2$ should be very high. The length of planning horizon $N$ will be determined by the forecast of the period before which the inflation rates will become less than the discount rate. In a planning horizon of unit time there are $\lambda/Q$ cycles, i.e., $N = \lambda/Q$. For this case equation (3) yields

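(4)  $P_T = \left[\frac{A + c_1 Q/R_1}{1 - e^{-R_1 Q/\lambda}} - \frac{c_1 \lambda}{R_1^2}\right]\left(1 - e^{-R_1}\right) + \left[\frac{Qc'}{1 - e^{-R_2 Q/\lambda}} - \frac{c_2 \lambda}{R_2^2}\right]\left(1 - e^{-R_2}\right),$

where $c' = c + \frac{c_2}{R_2}$.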

Case when ($i_1$ and $i_2$) < $r$

In  this  case,  the  cost  equation  (4)  will  be  differentiated  with  respect  to  Q and  equated  to 
zero.  This  yields 

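(5)  $\frac{\frac{c_1}{R_1}\left(1 - e^{-R_1 Q/\lambda}\right) - \left(A + \frac{c_1 Q}{R_1}\right)\frac{R_1}{\lambda}\, e^{-R_1 Q/\lambda}}{\left(1 - e^{-R_1 Q/\lambda}\right)^2} + K\, \frac{c'\left(1 - e^{-R_2 Q/\lambda}\right) - Qc'\,\frac{R_2}{\lambda}\, e^{-R_2 Q/\lambda}}{\left(1 - e^{-R_2 Q/\lambda}\right)^2} = 0,$

where $K = \left(1 - e^{-R_2}\right)/\left(1 - e^{-R_1}\right)$.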




R.  B.  MISRA 


Equation (5) can be solved by the use of search techniques. However, an approximate analytical solution can be obtained if we expand the exponential terms up to the first three terms and neglect the higher-order terms. This approximation has been found to yield good results in other situations with similar expressions [6]. After considerable simplification equation (5) reduces to


(6)  $\frac{c_1}{2\lambda} - \frac{A}{Q^2} + \frac{c' K R_1}{2\lambda} = 0.$
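The simplification that leads to equation (6) is not spelled out. One possible route is sketched below; it is not necessarily the author's own derivation. It approximates the unit-horizon cost before differentiating, assumes $K = (1 - e^{-R_2})/(1 - e^{-R_1})$ and $c' = c + c_2/R_2$, and keeps only the leading terms in $R_1 Q/\lambda$ and $R_2 Q/\lambda$.

```latex
% Sketch only: leading-order expansion, valid when R_k Q / \lambda is small.
\begin{align*}
\frac{1}{1 - e^{-x}} &\approx \frac{1}{x} + \frac{1}{2}, \qquad x \ll 1, \\
\frac{A + c_1 Q/R_1}{1 - e^{-R_1 Q/\lambda}}
  &\approx \frac{A\lambda}{R_1 Q} + \frac{A}{2} + \frac{c_1 \lambda}{R_1^2} + \frac{c_1 Q}{2 R_1}, \\
\frac{Q c'}{1 - e^{-R_2 Q/\lambda}}
  &\approx \frac{c' \lambda}{R_2} + \frac{c' Q}{2}. \\
\intertext{Weighting the first expression by $(1 - e^{-R_1})$ and the second by
$(1 - e^{-R_2})$, differentiating with respect to $Q$, and dividing by
$(1 - e^{-R_1})$ gives}
-\frac{A\lambda}{R_1 Q^2} + \frac{c_1}{2 R_1} + \frac{c' K}{2} &= 0
\quad\Longleftrightarrow\quad
\frac{c_1}{2\lambda} - \frac{A}{Q^2} + \frac{c' K R_1}{2\lambda} = 0 .
\end{align*}
```

Solving the last line for $Q$ gives $Q^2 = 2\lambda A/(c_1 + c' K R_1)$, which has the form of the classical square-root formula with a modified carrying cost.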


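Equation (6) yields

(7)  $Q^* = \sqrt{\frac{2\lambda A}{c I'}}$, where $I' = \frac{c_1}{c} + R_1 K\left(1 + \frac{c_2}{c R_2}\right)$, with $K$ as defined after equation (5).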


$I'$ can be called an adjusted inventory carrying cost, following the terminology of Hadley and Whitin [3]. Thus, in practice all that is needed is to calculate $I'$ and use it in place of $I$ in the Harris-Wilson-Camp formula [2,3]. This does not give the optimum $Q$, but the approximation is quite good, as will be seen later in an example. To find the optimum $Q$ by search techniques, we can use this approximate value as a starting point.


EXAMPLE 


Given $\lambda = 10{,}000$ units/year, $A = \$40$, $c = \$4.00$, $r = 0.20$, $i_1 = 0.08$, $i_2 = 0.14$, $c_1 = \$0.20$ per unit per unit time, and $c_2 = \$0.16$ per unit per unit time, then $I' = 0.153$, and equation (7) gives the approximate order quantity.

To  check  the  accuracy  of  the  approximation,  the  exact  value  of  Q was  calculated  from 
equation  (5)  by  trial  and  error.  The  exact  value  of  Q obtained  was  1160,  thus  the  approxima- 
tion is  quite  good. 


The  corresponding  Q from  the  Harris-Wilson-Camp  formula  is 


$Q = \sqrt{\dfrac{2 \times 10{,}000 \times 40}{4 \times 0.2 + 0.20 + 0.16}} = 831.$


Thus, as a result of inflation the optimum order quantity has increased. The corresponding costs are

$P_T = \$33{,}106$ for $Q = 831$

and

$P_T = \$32{,}822$ for $Q = 1148$.
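The arithmetic of the example can be retraced with a short script. This is a sketch under stated assumptions: the forms used for $K$ and $I'$ are a reading of the garbled equations, chosen because they reproduce the printed $I' = 0.153$, and the function names are hypothetical. Evaluated this way, the two square-root formulas come out near 1143 and 830, close to but not identical with the printed 1148 and 831; the small gaps may reflect rounding or a slightly different form in the original.

```python
import math


def adjusted_carrying_rate(c, c1, c2, r, i1, i2):
    """Adjusted carrying cost I' (assumed form: c1/c + R1*K*(1 + c2/(c*R2)))."""
    R1, R2 = r - i1, r - i2
    # Assumed reading of K: ratio of the unit-horizon factors of the two rates.
    K = (1.0 - math.exp(-R2)) / (1.0 - math.exp(-R1))
    return c1 / c + R1 * K * (1.0 + c2 / (c * R2))


def square_root_order_quantity(lam, A, annual_holding_cost_per_unit):
    """Harris-Wilson-Camp form: sqrt(2 * demand * ordering cost / holding cost)."""
    return math.sqrt(2.0 * lam * A / annual_holding_cost_per_unit)


# Data from the example in the text.
lam, A, c = 10_000.0, 40.0, 4.00
r, i1, i2 = 0.20, 0.08, 0.14
c1, c2 = 0.20, 0.16

I_adj = adjusted_carrying_rate(c, c1, c2, r, i1, i2)
Q_inflation = square_root_order_quantity(lam, A, c * I_adj)
Q_classical = square_root_order_quantity(lam, A, c * r + c1 + c2)

print(f"I' = {I_adj:.3f}")
print(f"Q with inflation adjustment = {Q_inflation:.1f}")
print(f"Q from classical formula    = {Q_classical:.1f}")
```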


In  summary,  the  optimum  order  quantity  is  changed  significantly  when  inflation  is 
included in the analysis. However, the reduction in costs is slight. The cost function in the EOQ
model  is  known  to  be  insensitive  in  the  neighborhood  of  the  optimum  Q.  It  is  even  less  sensi- 
tive when  given  in  present-worth  terms.  A further  extension  of  this  research  is  the  interesting 
case  in  which  lead  time  is  significant.  Since  the  time  value  of  money  is  considered  in  the 
model,  the  payment  policy,  i.e.,  whether  the  payments  are  made  in  advance  or  at  the  time  of 
delivery,  will  influence  the  model. 




REFERENCES 

[1] Bierman, Harold, and J. Thomas, "Inventory Decisions Under Inflationary Conditions," Decision Sciences 8, No. 1 (1977).

[2] Buzacott, J.A., "Economic Order Quantities with Inflation," Operational Research Quarterly 26, No. 3 (1975).

[3] Hadley, G., and T.M. Whitin, Analysis of Inventory Systems (Prentice-Hall, Englewood Cliffs, New Jersey, 1963).

[4] Van Hees, R.N., and W. Monhemius, Production and Inventory Control: Theory and Practice (Barnes and Noble, New York, 1972) pp. 81-101.

[5] Misra, R.B., "A Study of Inflationary Effects on Inventory Systems," Logistics Spectrum 9, No. 3 (1975).

[6] Misra, R.B., and A.W. Wortham, "The EOQ Model with Continuous Compounding," OMEGA, The International Journal of Management Science 5, No. 1 (1977).

[7] Wilde, D.J., Optimum Seeking Methods (Prentice-Hall, Englewood Cliffs, New Jersey,


NEWS  AND  MEMORANDA 


THE  1978  LANCHESTER  PRIZE 
Call for Nominations

Each  year  since  1954  the  Council  of  the  Operations  Research  Society  of  America  has 
offered  the  Lanchester  Prize  for  the  best  English-language  published  contribution  in  operations 
research. The Prize for 1978 consists of $2,000 and a commemorative medallion.

The  screening  of  books  and  papers  for  the  1978  Prize  will  be  carried  out  by  a committee 
appointed  by  the  Council  of  the  Society.  To  be  eligible  for  consideration,  the  book  or  paper 
must  be  nominated  to  the  Committee.  Nominations  may  be  made  by  anyone;  this  notice  con- 
stitutes a call  for  nominations. 

To  be  eligible  for  the  Lanchester  Prize,  a book,  a paper  or  a group  of  books  or  papers 
must meet the following requirements:

(1)  It  must  be  on  an  operations  research  subject, 

(2)  It  must  carry  a current  award  year  publication  date  or,  if  a group,  at  least  one 
member  of  the  group  must  carry  a current  award  year  publication  date, 

(3)  It  must  be  written  in  the  English  language,  and 

(4)  It  must  have  appeared  in  the  open  literature. 

The book(s) or paper(s) may be a case history, a report of research representing new results, or primarily expository.

For  any  nominated  set  (e.g.,  article  and/or  book)  covering  more  than  the  most  recent 
year,  it  is  expected  that  each  element  in  the  set  represents  work  from  one  continuous  effort, 
such  as  a multi-year  project  or  a continuously-written,  multi-volume  book. 

Judgments will be made by the Committee using the following criteria:

(1)  The  magnitude  of  the  contribution  to  the  advancement  of  the  state  of  the  art  of 
operations  research, 

(2)  The  originality  of  the  ideas  or  methods, 

(3)  New  vistas  of  application  opened  up, 

(4)  The  degree  to  which  unification  or  simplification  of  existing  theory  or  method  is 
achieved,  and 

(5)  Expository  clarity  and  excellence. 



Nominations  should  be  sent  to: 

George  L.  Nemhauser,  Chairman 
1978  Lanchester  Prize  Committee 
School  of  Operations  Research 
and  Industrial  Engineering,  Upson  Hall 
Cornell  University 
Ithaca,  NY  14853 

Nominations  may  be  in  any  form,  but  must  include  as  a minimum  the  title(s)  of  paper(s) 
or  book,  author(s),  place  and  date  of  publication,  and  six  copies  of  the  material.  Supporting 
statements  bearing  on  the  worthwhileness  of  the  publication  in  terms  of  the  five  criteria  will  be 
helpful,  but  are  not  required.  Each  nomination  will  be  carefully  screened  by  the  Committee. 
Nominations  must  be  received  by  May  30,  1979,  to  allow  time  for  adequate  review. 

Announcement  of  the  results  of  the  Committee  and  ORSA  Council  action,  as  well  as 
award  of  any  prize(s)  approved,  will  be  made  at  the  56th  National  Meeting  of  the  Society, 
October  21-24,  1979,  in  Milwaukee. 


1978  Lanchester  Prize  Committee 

Professor George Nemhauser, Chairman; Dr. Daniel Heyman, Dr. Ralph Keeney, Professor Leonard Kleinrock, Professor Peter Kolesar, Professor Donald Ratliff.


U.S. GOVERNMENT PRINTING OFFICE: 1979-281-491/2


